PRACTICAL ASPECTS OF THE USE OF METAL AT SIEMENS NIXDORF
by Keith Roberts
Abstract
The following account is based on a lecture given by Keith Roberts, Manager of Language Services, Siemens Group Services Limited, Chertsey, to the AGM of the Natural Language Translation Specialist Group of the BCS, held at King's College London on 22 September 1994.
The METAL machine translation system originated at the Linguistics Research Center in Texas, USA in 1959-60, when work on a German-English system was initiated. The German company Siemens took over the project in 1979 when it entered the American market for analog switching equipment in the telecommunications industry. LRC's involvement ceased in 1994, and the system is now under the control of Siemens Nixdorf (Sietec), who continue to use it within the framework of their Munich- and Paderborn-based translation service, the Sprachendienst.
The Sprachendienst employs about 80 people, has around 2,200 customers, and translates approximately 34 million words a year in the fields of information technology and telecommunications. The service is organised as a distributed translation environment, the local production centres having remote access to METAL and local access to the TWIN terminology management system.
Some time was devoted to describing the key project stages involved in translation at Siemens Nixdorf and to the integrated translation process (or 'process chain' as it is often referred to) at the Sprachendienst.
Particular emphasis was placed on how Machine Translation can be integrated into the documentation process:
MT as a service
- Various levels of post-editing in line with customer requirements
- Terminology projects for subsequent MT usage
- Project solutions
- Consultancy, including training
Written in LISP, METAL has been ported to a SUN platform and is capable of processing up to 1,000 pages a day in machine-time. The METAL Database Management System comprises four modules: the source language (SL) dictionary, the transfer dictionary, the target language (TL) dictionary, and the grammar rules. The actual number of grammar or linguistic processing rules is relatively small (about 550).
The typical sequence of operations for processing a text with METAL was described. The first phase is text acquisition and preparation, which involves deformatting the source text, i.e. removing features of layout and reducing the text to translation units comprising its constituent words. The second phase is lexical analysis, or dictionary look-up; any unknown terms are entered manually or semi-automatically into the lexicon. Translation is performed during the third or machine run phase. The fourth and final phase consists of post-editing and reformatting.
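The first two phases can be illustrated with a minimal Python sketch. It is not METAL's own code (the system is written in LISP); the function names, the convention that layout commands are lines beginning with a full stop, and the tiny lexicon in the usage note are all assumptions made for illustration.

```python
import re
from typing import Iterable, List, Set


def deformat(raw: str) -> str:
    """Phase 1a: remove layout features.

    Assumption: layout commands are whole lines starting with '.',
    as in some text formatters. Real source formats will differ.
    """
    lines = [ln for ln in raw.splitlines() if not ln.startswith(".")]
    # Join the remaining lines and collapse runs of whitespace.
    return " ".join(" ".join(lines).split())


def segment(text: str) -> List[str]:
    """Phase 1b: split the text into translation units (naive sentence split)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def unknown_words(units: Iterable[str], lexicon: Set[str]) -> Set[str]:
    """Phase 2: dictionary look-up; return the words missing from the lexicon."""
    words = {w.lower() for u in units for w in re.findall(r"[^\W\d_]+", u)}
    return words - lexicon
```

Given a hypothetical lexicon such as `{"das", "der", "system", "läuft", "druckt"}`, `unknown_words` would report `drucker` for the text "Das System läuft. Der Drucker druckt.", marking it as a term to be entered before the machine run.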
For the linguistic representation of the source language, METAL employs a dependency tree model. The system generates a parse tree for the SL sentence, which is then converted into a representation of the TL sentence by the transfer component. To translate compound noun phrases, a terminology look-up component searches for the longest string in the SL for which there is also a TL equivalent.
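The longest-match principle behind the terminology look-up can be sketched as follows. This is an illustration of the general technique only, assuming a term list keyed by tuples of SL tokens; the entries and the name `longest_match_lookup` are hypothetical and do not reflect METAL's internal data structures.

```python
from typing import Dict, List, Tuple

# Hypothetical German-English term list, for illustration only.
TERMS: Dict[Tuple[str, ...], str] = {
    ("Übersetzung",): "translation",
    ("maschinelle", "Übersetzung"): "machine translation",
    ("System",): "system",
}


def longest_match_lookup(
    tokens: List[str], terms: Dict[Tuple[str, ...], str]
) -> List[str]:
    """Replace SL token sequences by TL equivalents, preferring the longest match."""
    max_len = max((len(key) for key in terms), default=0)
    out: List[str] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first, then progressively shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in terms:
                out.append(terms[key])
                i += n
                break
        else:
            # No entry of any length starts here: pass the token through.
            out.append(tokens[i])
            i += 1
    return out
```

Because longer candidates are tried first, the two tokens "maschinelle Übersetzung" are rendered by the single entry "machine translation" rather than word by word; tokens with no entry are passed through unchanged.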
No presentation on the METAL Machine Translation System can dispense with at least a broad outline of the theory of MT, but thereafter the talk concentrated strictly on practical experience gained within the Sprachendienst over a number of years:
- Where MT works best
In the experience of Siemens Nixdorf (SNI), MT works best for specific subject areas and language pairs, with high-volume texts and, wherever possible, restricted language, all of which should be coupled with a continuous assessment of output.
- Suitability for MT
Emphasis was placed on the need to assess the source text for MT friendliness in terms of language pairing, document style, volume, subject area etc. Concrete examples of both good and poor quality MT translations were presented and discussed.
- Typical problems
Examples were given of some of the difficulties encountered at the stages of pre-editing (for example: terminology not coded adequately, non-translatable units not masked) and post-editing (for example: level of post-editing needed, text editor used, experience and attitude of human post-editor).
- Potential solutions
Recent efforts by the Sprachendienst to introduce the use of controlled language among its key accounts were described in some detail. 'Writing with the translation process in mind' concentrates on both human translation (at sentence and word level) and MT (for example: no combinations of words and special characters such as <$Date>). While a certain amount of success has been obtained to date, it is at times proving difficult to quantify the benefits, and technical authors and developers tend to perceive this approach as a restriction on their creativity.
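Where such variables cannot be avoided in the source text, one common remedy is to mask them before the machine run and restore them afterwards. The Python sketch below shows the idea under stated assumptions: the pattern for variables of the form <$Name> and the placeholder format `__NT0__` are hypothetical, and nothing here describes how METAL or the Sprachendienst actually handle masking.

```python
import re
from typing import Dict, Tuple

# Assumption: non-translatable variables look like <$Date> or <$User>.
VARIABLE = re.compile(r"<\$[A-Za-z_]\w*>")


def mask(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace each variable by a neutral placeholder and remember the original."""
    table: Dict[str, str] = {}

    def repl(match: "re.Match[str]") -> str:
        key = f"__NT{len(table)}__"
        table[key] = match.group(0)
        return key

    return VARIABLE.sub(repl, text), table


def unmask(text: str, table: Dict[str, str]) -> str:
    """Put the original variables back after translation."""
    for key, original in table.items():
        text = text.replace(key, original)
    return text
```

In use, `mask` would be applied during pre-editing and `unmask` during reformatting, so that the translation step only ever sees the placeholders.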
How SNI's Sprachendienst has attempted to introduce MT into a practical translation environment was set out in a detailed description of an in-house development known as DTS (Distributed Translation and Terminology Services). DTS can be accessed quickly and simply by translators, suppliers and customers via modem and LAN and offers a whole range of services, such as:
- Access to MT (METAL)
- Access to terminology services (glossary compilation, dictionary look-up and updating etc.)
- Automatic volume count
- Automated assessment of MT friendliness
- Link to invoicing system
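Two of these services, the volume count and the assessment of MT friendliness, lend themselves to a small illustration. The talk did not specify the criteria DTS applies, so the Python sketch below is an assumption throughout: it counts words and uses the share of overlong sentences as one possible friendliness indicator, with a hypothetical threshold of 25 words.

```python
import re


def volume_count(text: str) -> int:
    """Count the words in a text (a word is any run of word characters)."""
    return len(re.findall(r"\w+", text))


def long_sentence_ratio(text: str, max_words: int = 25) -> float:
    """Share of sentences longer than max_words.

    Assumption: long sentences are harder for MT, so a lower ratio
    suggests a more MT-friendly text. The threshold is illustrative.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return 0.0
    long_ones = sum(1 for s in sentences if volume_count(s) > max_words)
    return long_ones / len(sentences)
```

A production assessment would presumably weigh further factors named in the talk, such as language pair, document style and subject area; this sketch covers sentence length only.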
The final part of the talk was devoted to the issue of project costing, demonstrating how costings are built up at SNI with a view to ensuring cost savings for the customer, in certain cases substantial ones, through the appropriate use of MT/DTS.
In conclusion, it was clear that METAL is being used commercially within the Siemens organisation. It is regarded as one of a set of tools which may contribute to improving operational efficiency; as such its output is subject to continuous assessment and evaluation.