Machine Translation Review
No. 13, December 2002
ISSN: 1358-8346
This page URL: http://www.bcs.org.uk/siggroup/nalatran/mtreview/mtr-13/6.htm
Report on the 6th EAMT Workshop
Teaching Machine Translation
November 14-15 2002: UMIST, Manchester, England
Organised by the
European Association for Machine Translation
in association with
the British Computer Society Natural Language Translation Specialist Group
by Judith Belam and Derek Lewis, University of Exeter
Synopses of the Papers
Orie Miyazawa, University of Wolverhampton (MT Training for Business People and Translators), described how the PAT MT system (bi-directional English-Japanese) is used to provide in-house training for translators, managers and engineers at Epson Telford Ltd UK and to promote the use of machine translation. She outlined the types of documents used by the company and stressed the importance of pre- and post-editing in the translation process. Some basic rules for using a controlled language were also included.
Sivaji Bandyopadhyay, Jadavpur University (Teaching MT: An Indian Perspective), was unable to attend the workshop in person, but his paper was delivered by a member of UMIST. He drew attention to various factors affecting the teaching of MT, in particular the state of MT research in the country, language policy and patterns in the education system, student motivation and the status of the software industry in a multilingual setting. He described in detail the state of MT in India and referred to a number of institutions in which MT is included in the computer science or computational linguistics curriculum.
Elia Yuste, University of Zurich (MT and the Swiss Language Service Providers), outlined the linguistic situation in Switzerland with particular reference to the growing need for translation and multilingual documentation. Awareness and usage of computer-mediated translation tools, however, proved to be surprisingly low in Switzerland. The author presented the results of a questionnaire distributed to language professionals and to service providers. The results showed the extent to which computer-based and CAT tools are employed in Switzerland and highlighted the need for awareness-raising and training.
Dimitra Kalantzi, UMIST (Teaching MT and Computer-Assisted Translation Tools in Greece), described the results of her survey of teaching and training in MT/CAT in private schools and university departments in Greece. Although education in MT is currently at a relatively low level, there are signs that it is beginning to take off, mainly at undergraduate level. Obstacles to improvement include lack of government funding, low awareness levels, and lack of trained staff and established teaching posts; in addition, few Greek universities offer degrees in translation for undergraduates/postgraduates. The paper concluded with recommendations for change.
Paul Bennett, UMIST (Teaching Contrastive Linguistics for MT), presented a framework for teaching contrastive-oriented linguistics to students of machine translation. He included a classification of cross-linguistic differences and showed how such differences can be neutralised in abstract representations in order to simplify the operation of transfer in MT. He described a canonical form for classifying differences across languages and outlined how it can be used to identify diverging features such as movement, insertion and omission, which can then be accommodated in formulating simplified transfer modules. Finally, he discussed the model’s relevance for a broader typological foundation of language differences.
Andy Way, Dublin City University (Testing Students' Understanding of Complex Transfer), focussed on how to provide students with an understanding of the computational process of complex transfer in MT systems. Students are required to identify and classify instances of complex transfer, draw up dependency structures, write translation rules in a notation of their choice, and construct a Prolog-type rule which generates the translation from an input string. The methodology is appropriate for those teaching more advanced programming skills.
J. Gabriel Amores, University of Seville (Teaching MT with Xepisteme), described Xepisteme, a tool for developing transfer-based MT systems. Based on lexical functional grammar, it is being used to teach MT at the universities of Seville and Pompeu Fabra in Spain. The tool’s advantages are that it lets the user monitor and manipulate each processing stage (analysis, transfer, or full translation) and plug in different components (e.g. associate a lexicon with different grammars). With no programming expertise the user can also develop new modules, including lexicons and grammar processing rules. Plans are afoot to implement an MS Windows version, to provide a direct MT (as opposed to transfer) version, and to link the transfer module to WordNet.
Walther von Hahn and Christina Vertan, University of Hamburg (Architectures of ‘Toy’ Systems for Teaching Machine Translation), described how students, provided with a small, annotated lexicon (based on a 100-sentence test corpus), were given the task of designing and implementing an MT system and its various modules. Students also evaluated their systems and the educational success of the approach.
Svetlana Sheremetyeva, Handelshøjskolen i København (An MT Learning Environment for Computational Linguistics Students), argued for MT software specifically designed for training purposes. She presented the APTrans system (an experimental interactive MT system for translating patent claims between any pair of European languages) and its developer tools as a possible model.
Chi-Chiang Shei, National Tsing Hua University (Teaching MT through Pre-editing), outlined some of the major differences between Chinese and English, with special reference to areas of structural, lexical and semantic difficulty, and illustrated how students in Taiwan were encouraged to develop pre-editing skills in order to improve output from an English-Chinese MT system. The process not only familiarised students with MT technology but also promoted their competence in English composition.
Pointing to the growing global demand for post-editing, Sharon O’Brien from Dublin City University (Teaching Post-editing: A Proposal for Course Content) discussed the skills which a post-editor requires and the need for post-editors to understand the technologies of MAT and also the relationship between pre-editing and MT output. She outlined the elements of a proposed course in post-editing.
Enrique Torrejón and Celia Rico, Universidad Europea (Controlled Translation: A New Teaching Scenario Tailor-Made for the Translation Industry), discussed in some detail the concept of controlled languages, reviewed practical examples of CLs that have been used in industry, and outlined guidelines and writing rules. They described the methodology and types of practical exercise underpinning a new course at the European University in Madrid which aims to teach the pre- and post-editing skills that will be increasingly required from professional translators.
The paper by Heather Fulford, University of Loughborough (Freelance Translators and Machine Translation), described the results of an exploratory study undertaken to establish the uptake of MT among freelance translators in the UK. Preliminary findings suggested that a relatively low uptake of MT was matched by a keen interest in learning more about MT and in seeing the development of resources that could provide self-directed training.
Mark Shuttleworth, Imperial College (Combining MT and CAT on a Technology-Oriented Translation Masters), reported on a project to evaluate how students on a masters-level MT programme performed in using a combination of MT and TM technologies to translate a large extract of a document on medical information. He concluded that the project gave students an important learning experience, although outcomes might depend on the particular software tools used (e.g. TRADOS v. Déjà Vu, and SYSTRAN) and on their interaction.
Judith Belam, University of Exeter (Teaching MT Evaluation by Assessed Portfolio), described her experiences of an assessed independent study project on MT evaluation for final-year undergraduates on a modern languages degree. After discussing the rationale of teaching MT, she presented several reasons why the project extended students in terms of motivation, competence in the foreign language, and transferable skills.
Mikel Forcada, University of Alicante (Explaining Real MT to Translators: Compositional Semantics and Word-for-Word), explored the advantages of presenting MT to students as a middle way between the theoretically motivated model of semantic compositionality and a set of refinements over basic word-for-word substitution. He aimed to influence students’ expectations of MT by showing them the differences between the compromises required to implement a linguistically accurate system and the volume of work needed in order to bring a word-for-word model (model zero) to reasonable levels of performance.
Federico Gaspari, UMIST (Using Free On-line Services in MT Teaching), discussed the role that free on-line machine translation services can play in teaching MT. After outlining their advantages (freely available in many language pairs) and disadvantages (no access to rules or dictionaries for customisation), he focused on the increasing need for web localisation and multilingual on-line content management. In conclusion he proposed a broad approach to teaching MT which emphasises the role that on-line MT can play in communication rather than its technical or linguistic performance.
Natalie Kübler, University of Paris 7 (Teaching Commercial MT to Translators: Bridging the Gap between Human and Machine), presented an experiment conducted at the University of Paris which aimed to show translation trainees how and why MT can be incorporated into their work. She described the translation projects which the students undertook and the language data and tools which they used in carrying the projects through to submission to the customer. The data comprised specialised dictionaries and corpora (generally available on a web-based interface) and the tools included terminology extraction software, a concordancer, and MT (Systran). Students worked in groups to create customised dictionaries, evaluate the MT system’s performance and post-edit the results. As a result students were better able to grasp the advantages of MT and appreciate why general purpose MT is hard to achieve. They were also able to place MT in context alongside human translation.
Who Uses MT?
Apart from its orientation towards teaching MT, the workshop also offered some answers to the question of who actually uses MT in practice. After a couple of sessions with a proprietary MT system it might be tempting to conclude that no-one could ever seriously use MT. However, the workshop showed that such a conclusion would be premature. Orie Miyazawa described how Epson Telford, a UK subsidiary of the Japan-based Epson Company, manufactures ink cartridges for consumer inkjet printers, with all manufacturing methods and procedures originating from the headquarters in Japan. Senior positions in the company are occupied by Japanese staff from the head office, some of whom require assistance in communication. The demand for translation is therefore extremely high and cannot always be met by human translation.
The author stressed the importance of training translators in the use of MT and of pre- and post-editing, and devoted a long section to the types of document and their suitability for translation by MT. She identified some document types as unsuitable (e.g. presentations made for senior board members from Japan, minutes of meetings) and others as eminently suitable (e.g. parts lists, which often need only a simple word-for-word translation once the names of the parts have been entered into the dictionary). In other cases (e.g. engineering instructions) the decision was unclear: on the one hand accuracy is vital, while on the other the material often needs to be translated very quickly. As for e-mails, some need thorough translation because the information is very important, while others concern company events and general information; these are generally simple and straightforward and do not require a high-quality translation, especially since all the addressees are members of the company.
In conversation with Viggo Hansen, whose company, Zacco A/S (Hellerup, Denmark), translates patents into Danish, it emerged that a patent becomes valid in Denmark only after it is translated into Danish. Naturally this generates a large and ongoing translation requirement. Mr Hansen, who was required to translate approximately 3 million words a year, explained that he valued the speed and consistency of MT for this task, and stressed the ease of post-editing predictable MT errors compared with the difficulty of correcting human translation with its unpredictable errors, misinterpretations and omissions.
In the light of the above it might be expected that MT is becoming more widespread. There were, however, some surprises here too. Elia Yuste (University of Zurich) had begun a survey of the translation industry in Switzerland, where one might expect the large translation requirement in a multilingual nation to generate a large demand for MT. Surprisingly, perhaps, her survey has shown so far that, ‘with the exception of two leading corporate language service providers who have performed evaluation exercises of MT systems and adopted one in their workflow, there is no overall interest in MT in the Swiss translation arena. Regrettably, fears and misconception about MT still remain’.
In India, too, it emerged from Sivaji Bandyopadhyay’s paper that the use of MT does not seem to be widespread, although he stressed the need for it. According to the author, India is a vast multilingual country with 18 scheduled languages, and there is a great demand for translation of documents from one language to another. Most of the state governments work in their respective regional languages, whereas the Union Government’s official documents and reports are in bilingual form (Hindi/English). For proper communication, these reports and documents need to be translated into the respective regional languages. Given the limitations of human translators, most of this information is missing and is not percolating down. A machine-assisted translation system or a translators’ workbench would increase the efficiency of human translators.
MT and Professional Translators
Another interesting question was the degree to which professional translators are informed about MT. The survey by Heather Fulford (Loughborough University Business School) suggested that only 7% were actively using MT, although over half had been asked to do post-editing work. There was a general feeling that more training in MT would be useful to them.
Many of the speakers at the workshop were engaged in training translators, and there were proposals about what trainee translators should be learning. Several areas came up for consideration. One of these was pre-editing and controlled languages. Enrique Torrejón and Celia Rico from the Universidad Europea de Madrid talked about the importance of controlled languages and the fact that their use is becoming more widespread, thus giving rise to a whole new area of professional translation which the authors called ‘controlled translation’. In their words: ‘Pre-editing texts following the constraints from the controlled language specifications or even writing texts directly in CL become a new set of skills that translators need to take into account.’ Chi-Chiang Shei stressed the importance of pre-editing when translation is required between two languages whose structures differ very markedly, such as English and Chinese.
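To make the idea of pre-editing against controlled-language constraints more concrete, the following is a minimal sketch of an automated check that a trainee might run before submitting text to an MT system. It is not taken from any of the papers: the 20-word sentence limit, the naive sentence splitter and the function names are all assumptions chosen purely for illustration, and real controlled-language specifications contain many more rules.

```python
import re

# Hypothetical constraint: real controlled languages define their own limits.
MAX_WORDS = 20


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def flag_long_sentences(text, max_words=MAX_WORDS):
    """Return (sentence, word_count) pairs that exceed the word limit."""
    flagged = []
    for sentence in split_sentences(text):
        count = len(sentence.split())
        if count > max_words:
            flagged.append((sentence, count))
    return flagged
```

A pre-editor would then shorten or split each flagged sentence by hand; the check only points to candidates and does not rewrite anything itself.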
A second area was that of post-editing. Sharon O’Brien from Dublin City University proposed including a component on post-editing in courses for translators. She underlined the value of post-editing as a way of making the most of the speed and efficiency of MT, quoting a study by Vasconcellos and León which claims that ‘a full-time trained post-editor working on-screen, can produce polished, standard quality output at a rate of between two and three times faster than traditional translation’. She went on to state that post-editing was a skill which needed to be learned and gave some suggestions for course content.
A third area had to do with understanding how an MT system works at a technical level. Andy Way from Dublin City University described the need for all students of MT to have an idea of how the system actually operates. His own students were enrolled on a three-year course in computational linguistics, although he pointed out that ‘Any contemporary course on MT ought to equip students with knowledge of the differences between rule-based and statistical MT; direct and indirect approaches; and transfer-based and interlingual systems.’ He went on to talk about the issue of complex transfer, ‘an integral component of this section of a course on MT’, which refers to those difficult cases of translation where the grammatical structure diverges considerably between languages (as in German Es gefällt mir for English I like it).
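The Es gefällt mir example can be sketched in code to show why such cases need more than word substitution. The sketch below is only an illustration under stated assumptions: it is written in Python rather than the Prolog-type rules mentioned in the synopsis, and the tiny lexicon, the pronoun tables and the function names are hypothetical, not taken from the paper. A word-for-word baseline produces the stilted 'it pleases me', whereas a single transfer rule swaps the grammatical roles to give 'I like it'.

```python
# Illustrative sketch only: the lexicon, rule format and function names
# are hypothetical and not taken from any paper at the workshop.

# Toy German-to-English lexicon used by the word-for-word baseline.
LEXICON = {"es": "it", "gefällt": "pleases", "mir": "me"}

# German dative pronoun (experiencer) -> English subject pronoun.
DATIVE_TO_SUBJECT = {"mir": "I", "dir": "you", "ihm": "he", "uns": "we"}
# German nominative pronoun (subject) -> English object pronoun.
NOMINATIVE_TO_OBJECT = {"es": "it", "er": "him"}


def word_for_word(sentence):
    """Baseline: replace each source word in place, keeping source order."""
    return " ".join(LEXICON.get(w.lower(), w) for w in sentence.split())


def transfer_gefallen(sentence):
    """Complex transfer for 'X gefällt Y': swap the argument roles.

    The German subject becomes the English object and the German dative
    becomes the English subject, with 'like' as the target verb.
    Returns None when the three-word pattern does not match.
    """
    words = sentence.lower().rstrip(".").split()
    if len(words) != 3 or words[1] != "gefällt":
        return None
    subject, _, dative = words
    if subject not in NOMINATIVE_TO_OBJECT or dative not in DATIVE_TO_SUBJECT:
        return None
    eng_subject = DATIVE_TO_SUBJECT[dative]
    # Third-person singular English subjects need 'likes'.
    verb = "likes" if eng_subject == "he" else "like"
    return f"{eng_subject} {verb} {NOMINATIVE_TO_OBJECT[subject]}"


def translate(sentence):
    """Try the complex-transfer rule first, then fall back to the baseline."""
    return transfer_gefallen(sentence) or word_for_word(sentence)
```

The rule covers only pronoun arguments in a fixed three-word pattern; a teaching system of the kind described would generalise this over dependency structures rather than surface strings.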