Report on Commercial Machine Translation in a Manufacturing Industry Domain
by Orie Fukutomi,
School of Humanities, Languages and Social Sciences,
University of Wolverhampton, United Kingdom.
E-mail: n9327544@wlv.ac.uk
Abstract
The aim of this paper is to report on an experiment using a Commercial Machine Translation (CMT) software package in a manufacturing company in the UK, with particular reference to Japanese/English machine translation. It presents the main difficulties involved in translating industrial documents from Japanese to English and discusses how the productivity and quality of translation can be improved through the use of commercial Machine Translation (MT) software. The improvements are proposed from the point of view of a translator working in a manufacturing factory. The survey focuses on a manufacturing organisation which does not have the resources needed to develop its own MT system. The globalisation of the Japanese manufacturing industry makes it necessary for the translation of manuals and other documents to be as rapid as possible. In this paper, linguistic features of both English and Japanese are discussed from the evaluation experiment in order to make up writing rules for members of staff at Makita Manufacturing Europe. It also discusses the viewpoint of the British engineers who read the translated manuals. The result of the experiment is also examined in terms of whether it reflects the theoretical viewpoint or not.
1. Outline Structure
The disappointment expressed about translations generated by MT has not stopped a substantial number of people from using it for serious documentation. Although it is still impossible for MT to process natural language completely, the level achieved is significant. Yet, not many users are informed about the properties or strengths of MT. In this paper, one Commercial Machine Translation software package, Honyaku Adapter II (HAII) from NEC, has been chosen in order to demonstrate how it can be utilised in a medium-sized company, Makita Manufacturing Europe Ltd. (MME), where there is no translation department.
The author chose test sets from a Japanese organisation, Japan Electronic Industry Development Association (JEIDA) [Isahara, 95]. However, this experiment focuses only on the manufacturing domain, where the language can be a sublanguage. As can be seen from the example below, test sets are not particularly relevant to the evaluation of sentences from the manufacturing domain.
Thus, instead of using actual sentences from JEIDA's test sets for the evaluation, the author used the categories of the test sets to evaluate actual texts from MME translated by a CMT package. In this way, the result of the evaluation offers immediate feedback to manufacturing companies that require translation services. The experiment procedure was as follows. Firstly, more than five hundred Japanese sentences were chosen from the typical types of document held by Makita Manufacturing Europe Ltd. (MME), and these were then translated using the Commercial Machine Translation software, HAII. Secondly, an evaluation was carried out using criteria from JEIDA's test sets. In this paper, linguistic features of both English and Japanese are discussed from the evaluation experiment in order to make up writing rules for members of staff at Makita Manufacturing Europe. Accordingly, recommendations are made concerning how a company seeking globalization could resolve communication problems by employing Commercial Machine Translation.
2. General Background of Commercial Machine Translation Products (Japanese/English) on the Market
The availability of CMT for non-professional translators has increased tremendously over the last five years. Ten years ago, MT systems in Japan, like MT systems in other countries, ran only on large-scale mainframe computers or on UNIX workstations. The cost of a system was in the region of £20k and the market being targeted consisted of professional translators and translation agencies [Johnson 97]. Hence the target customers were organisations where the volume of translation required was high enough for the MT system to pay for itself.
Now, many MT packages cost under 9,800 yen (approximately 100 US dollars) on the Japanese market. This is an incredibly low price compared to the price of MT software packages being sold in other countries. For example, SYSTRAN sells English to Japanese and Japanese to English MT software for 995 US dollars, and the same product is sold at 128,000 yen (approximately 1,000 US dollars) on the Japanese market. According to Ian Hutchins, the quality of affordable MT in the European market is not as high as that of its Japanese counterparts [Hutchins, 99].
Owing to the fact that any customer can pick up a CMT package at any software shop in Japan, the user's guide is the most valuable information source for users, partly because it is also the only source of information. However, very few user's guides give a thorough explanation of the weaknesses of CMT or of how to make the most of it. For instance, a typical guide asks users to cut long sentences into shorter sentences and to add subjects and objects if they are omitted. This may sound helpful, but unless an explanation is provided as to what the subject and object are in a sentence, this type of guidance is not practical at all.
If a user is an MT researcher, such as the author, who understands the weaknesses and strengths of MT, this single guidance page may suffice. Yet most users are not MT specialists. They require more information and explanation in order to comprehend MT capabilities. The developer should remember that users of MT are not necessarily well educated or well informed in the field of linguistics. The lack of quality in user's manuals creates disappointment among CMT users who have high expectations of the product when purchasing it. This report aims to support better use of MT in the manufacturing industry.
3. HAII Dictionary and Creating a User's Dictionary
Creating a user's dictionary is a vital part of the customisation of MT. The dictionary of HAII stores 100,000 words. A user's dictionary can be created in order to improve the quality of translation. NEC, the developer of HAII, regularly updates the dictionary with newly coined words and offers the updates to customers over the internet. User's dictionaries are also shared, and the company makes them available to other users on the internet. The company is hoping to offer a facility which enables its users to share user's dictionaries from the MT products of its competitors [Kamei et al 97]. Yet these resources are not sufficient, and the current situation still requires each organisation to build its own user's dictionary.
It is easy to say that you can create your own dictionary and that you will then receive a high quality translation. However, creating a user's dictionary is not as simple as it may look. First of all the creator must possess knowledge about basic Japanese and English grammar in order to build a useful dictionary.
For example, if the user wants to enter a Japanese verb, he/she ought to know the type of verb: there are eight types of verbs in Japanese depending on how they conjugate. Therefore, it is essential for the user to acquire basic Japanese grammar as well as English grammar. It is vital for him/her to know the parts of speech in both Japanese and English in order to produce a user's dictionary. So how can this be expected of a user who does not know much about English grammar? He/she would probably claim that this is the very reason why he/she bought the MT package: users who knew English grammar well would not need to use MT. Guidance is therefore needed for a user to create a user's dictionary. It also has to be pointed out that native Japanese engineers do not necessarily have sufficient knowledge of Japanese grammar.
Furthermore the survey shows that the use of a sublanguage within a company is so significant that it is not straightforward or realistic to impose the same terminology among different organizations [Nikkei 28 May 99].
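The knowledge a dictionary creator needs can be made concrete with a small sketch. The following Python fragment is illustrative only: the `UserEntry` record, the part-of-speech labels and the `verb_class` field are assumptions made for this example and do not describe HAII's actual dictionary format. It simply shows that an entry cannot be completed without choosing a part of speech and, for verbs, a conjugation type.

```python
from dataclasses import dataclass
from typing import List, Optional

# Illustrative label set; not HAII's actual part-of-speech inventory.
PARTS_OF_SPEECH = {"noun", "verb", "adjective", "adverb", "particle"}


@dataclass(frozen=True)
class UserEntry:
    """Hypothetical user-dictionary record (not the HAII file format)."""
    japanese: str                      # source term, romanised here
    english: str                       # the single target term for the entry
    pos: str                           # part of speech chosen by the user
    verb_class: Optional[str] = None   # conjugation type; needed for verbs


def validate(entry: UserEntry) -> List[str]:
    """Return the problems that make an entry unusable (empty if none)."""
    problems = []
    if entry.pos not in PARTS_OF_SPEECH:
        problems.append(f"unknown part of speech: {entry.pos}")
    if entry.pos == "verb" and entry.verb_class is None:
        problems.append("verb entries need a conjugation type")
    if entry.pos != "verb" and entry.verb_class is not None:
        problems.append("only verbs take a conjugation type")
    return problems
```

A noun such as 'taiatsu shiken' passes with only a part of speech, whereas a verb such as 'naraberu' is rejected until the user supplies its conjugation type, which is exactly the point at which a user without grammatical training needs guidance.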
4. Industrial Background of Japanese Companies
This section briefly explains the background of the globalization of the Japanese manufacturing industry, and also the background of Makita Manufacturing Europe (MME), where the author conducted the MT evaluation experiment.
MME started its operation in the early 1990's in Telford. It is a power tool manufacturer for professionals, and they decided to produce power tools for the European market. They brought machines and equipment from Japan to set up production lines, and training has been carried out using manuals from Japan. When the necessity for translation of these documents was identified, they decided to hire an in-house translator. There are more than 25 Japanese manufacturing companies like MME in Telford.
In most cases the Japanese companies try to use English for their communication. Yet it is not easy for Japanese employees to compose documents in English due to the fact that communication skills were not widely taught in the Japanese education system until recently.
5. Evaluation of Performance
Since its development MT evaluation has been discussed extensively, yet there is not one standard evaluation method due to the fact that MT evaluation has different requirements for the different MT stakeholders such as translators, information consumers, managers, researchers and novice users. On top of this, sometimes there is more than one translation example for a particular sentence from the source text [White 98, Jones 98]. It is possible to carry out evaluation of a CMT either from a user's point of view or a developer's point of view. There are more than a few research groups which have conducted research on MT evaluation in Japan [Nagao 96, Ikehara et al 94, Tomita 92, Ikehara et al 92] as well as in other countries [White and Taylor 98, Povlsen et al 98, Bech 97]. This paper does not discuss the evaluation itself but focuses on linguistic features identified in the evaluation experiment.
In the following sections linguistic features of the original text from Makita Japan are analysed in order to make up writing rules for members of staff at Makita Manufacturing Europe.
5.1. Result of the Experiment from HAII and MME Language
This section presents the investigation on the CMT experiment done at Makita Manufacturing Europe (MME). The author worked for the company as an in-house translator from October 1996 to November 1997.
5.1.1. MME vocabulary word selection
Companies may use different terminology for the same object or act, and MME is no exception. A few words used at MME cannot be found in a technical dictionary. Therefore the author needed to input the MME vocabulary into the dictionary of Honyaku Adapter II (HAII). For example, as MME manufactures power tools, a few electrical tests are required, and 'taiatsu shiken' is one of them. The word is found in a technical dictionary as 'endurance test'; however, MME uses the English term 'withstanding test', as shown in Table 1.
Table 1:
Japanese (MJ document) | Dictionary (English translation) | MME Term (English)
Taiatsu shiken | endurance test | Withstanding test
Moreover, engineers at Makita Japan (MJ) sometimes write manuals in regional dialect. MT cannot process some expressions that are only used in a certain area of Japan.
[n: noun p: particle v: verb]
e.g.
Sagyosha n operators
ga p
boranai v not wait for
you ni p p
ki wo tsukenu n p v to take care
(Translation by a human translator: Care should be taken for not letting operators wait for the material.)
The second example is a pattern that is always used in operation manuals. It is grammatically wrong in Japanese because it uses the wrong particle, 'ni' (meaning 'at'), where it should use 'wo' (the object marker).
e.g.
Hako n box
ni p in
narabe v line up
(Place the carton on the line.)
The last word, 'narabe', also has a missing element: it lacks the ending 'ru'. 'Narabe' is the stem of the verb 'naraberu'. If the sentence were rewritten in formal Japanese, it would be 'Hako wo naraberu'.
The definition of a word in the HAII dictionary can easily be changed, so the author changed definitions when necessary. For example, HAII translated the word 'okyaku san' into 'a guest'. This word can be translated into a few English words, but 'a customer' is more appropriate for an MME document.
When HAII comes across a word that is not in its dictionary, it leaves the word untranslated. Consequently the translated sentence contains a stray Japanese word in Japanese characters. Since Makita uses special terminology, whenever this happens the output has to be post-edited and the word registered in the user's dictionary. Even when a term is translated, the result may differ from the MME term. Here are some examples:
Table 2:
Japanese (MJ document) | Translated by HAII | MME Term (English)
Shijisho | Indication book | Instruction manual
Shimetsuke torque | Fastening-up torque | Tightening torque
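Until such terms are registered in the user's dictionary, the same substitutions can be applied to the MT output as a post-editing step. The following Python sketch assumes nothing about HAII itself: it works on plain English output text, the term pairs are taken from Tables 1 and 2, and the function name `apply_mme_terms` is hypothetical.

```python
import re

# Generic rendering -> MME house term, taken from Tables 1 and 2.
MME_TERMS = {
    "endurance test": "withstanding test",
    "indication book": "instruction manual",
    "fastening-up torque": "tightening torque",
}


def apply_mme_terms(text: str, terms: dict = MME_TERMS) -> str:
    """Replace generic renderings with house terms, longest phrase first."""
    for generic in sorted(terms, key=len, reverse=True):
        # Whole-phrase, case-insensitive match so 'Indication book' is caught.
        pattern = re.compile(r"\b" + re.escape(generic) + r"\b", re.IGNORECASE)
        text = pattern.sub(terms[generic], text)
    return text
```

The replacement is always inserted in lower case, so a term at the start of a sentence still needs its capital restored by the post-editor. A table of this kind is only a stopgap; registering the terms in the user's dictionary lets the MT system choose them during translation.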
5.1.2 (1) Long Sentences
The user's guide supplied with an MT system typically instructs users to input short sentences. However, HAII managed to translate rather long sentences without any pre-editing, as in the following example.
[n=noun p=particle v=verb adj=adjective adv=adverb]
e.g.
Akafuda-sakusen n red card strategy
to p as
wa p for
akai adj red
fuda n tag
wo p
tsukatte v attach
kojo n factory
ni p in
habikotteiru v be rampant
aka n dirt
wo p
dare n everyone
ni p to
demo p even
wakaru v understand
yo p in
ni p order
suru v to do
seiri n sorting out
no p of
yarikata n method
desu p it is
HAII produced the following translation for the above Japanese sentence in a document. 'The red card strategy is a way of the arrangement to make anyone understand the dirt which is being rampant at the factory using the red card.'
Although the translation was not very natural, it is understandable by British engineers and team leaders in a factory. It has to be noted that there are always about ten Japanese members of staff at MME, and they can explain if British members are not sure of the meaning of the translation. However, the experiment shows that a sentence which lacks a subject and object and has too many embedded phrases is not translated satisfactorily.
5.1.2 (2) Changes Have to Be Given in Original Sentences
Since Japanese engineers never take a formal technical writing course, they often write sentences which are ambiguous. In addition, the fact that the subject and object can be omitted in a Japanese sentence allows them to write even more ambiguous sentences.
For example, there are many sentences without subjects and objects in Makita documents, and these need pre-editing in order to be processed by MT, as in sentence 87:
e.g.
Kojo n factory
dewa p p in
okyakusan n customers
ni p by
yorokonde v feel happy
itadakeru p (polite ending)
shinamono n products
wo p
tukuru v make
tameni, p p in order to
mainichi n every day
isshoukenmei adv hard
desu p
The above sentence omits its subject, watashi tachi (we).
As a result, the translated sentence generated by HAII did not convey the correct meaning. After the subject was added, the MT produced an acceptable translation, as follows:
'At the factory, to make the product which it is possible for the customer to be glad about, we are strenuous every day.'
5.1.2 (3) Imperative Sentences
Instruction manuals are full of imperative sentences. '-Suru koto' is a phrase used as a command form. 'Koto' is a nominalizer attached after a declarative statement. However, 'koto' also has a different meaning in Japanese, 'thing'. HAII translated 'koto' with the latter meaning, so the pre-editor has to rewrite the whole sentence using the imperative form (te-form of verb + kudasai).
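The pre-editing step described above can be sketched for the simplest case. The Python fragment below is an illustration under stated assumptions, not a general solution: it works on romanised text, it handles only sentences ending in 'suru koto' (where the te-form is 'shite'), and the function name `rewrite_suru_koto` and the sample verb phrase 'kakunin suru' (to check) are chosen for this example only. Other verbs need their own te-forms, which requires knowing the conjugation type.

```python
import re

# Matches a sentence-final 'suru koto', with or without a closing full stop.
_SURU_KOTO = re.compile(r"\bsuru koto(?P<stop>\.?)\s*$")


def rewrite_suru_koto(sentence: str) -> str:
    """Rewrite '... suru koto' as '... shite kudasai'; leave other input alone."""
    return _SURU_KOTO.sub(lambda m: "shite kudasai" + m.group("stop"), sentence)
```

For instance, `rewrite_suru_koto("kakunin suru koto")` gives 'kakunin shite kudasai', while a sentence in which 'koto' is not sentence-final is returned unchanged.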
5.1.2 (4) Phrasal Verbs and Idiomatic Expression
When translating idiomatic expressions, HAII is at a disadvantage. In sentence 11, HAII broke up the idiomatic expression into individual words and translated each word separately. The resulting sentence is therefore not meaningful and needs to be post-edited:
e.g.
ato n back
wo p
tatanai v not cut off
The HAII translation is 'doesn't cut off the back'. HAII could not recognise the phrasal verb 'ato wo tatanai' (never stop), so it translated word by word.
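The difference between word-by-word translation and treating the expression as a unit can be sketched as a longest-match lookup. The Python fragment below is a toy illustration and makes no claim about how HAII works internally: the `IDIOMS` and `WORDS` tables and the function `gloss` are hypothetical, and the word glosses are based on the example above.

```python
# Idiom table consulted before the word-level dictionary (hypothetical).
IDIOMS = {("ato", "wo", "tatanai"): "never stop"}

# Word-level glosses based on the example above; the particle maps to nothing.
WORDS = {"ato": "back", "wo": "", "tatanai": "not cut off"}


def gloss(tokens, idioms=IDIOMS, words=WORDS):
    """Gloss tokens left to right, preferring the longest matching idiom."""
    longest = max((len(key) for key in idioms), default=1)
    out, i = [], 0
    while i < len(tokens):
        # Try multi-word idioms first, longest candidate first.
        for size in range(min(longest, len(tokens) - i), 1, -1):
            chunk = tuple(tokens[i:i + size])
            if chunk in idioms:
                out.append(idioms[chunk])
                i += size
                break
        else:
            # No idiom starts here: fall back to the single-word gloss.
            out.append(words.get(tokens[i], tokens[i]))
            i += 1
    return [g for g in out if g]
```

With the idiom table in place, `gloss(["ato", "wo", "tatanai"])` yields the single unit 'never stop'; with an empty idiom table the same tokens come out as 'back' and 'not cut off', which mirrors the word-by-word result reported above.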
5.1.2 (5) Problems of Double Subjects
In most languages, a simple sentence has only one subject (nominative case), whereas in Japanese, many adjectives and some verbs can dominate two surface subject cases within a simple sentence. This is called the double-subject construction, and it confuses MT systems when they analyse sentences [Oku 96].
6 Suggestions in Document Preparation (Writing Rules)
The quality of input sentences affects the quality of translation, so staff at MME should be provided with adequate information on this subject. As we know, natural language is dynamic and flexible, and ambiguity is inevitable. We must emphasise this point strongly to CMT users, and educate them on how to utilise CMT in their organisations. CMT is too good to be ignored or abandoned, though it does not have human intelligence. Customers will be disappointed when a product fails to meet their expectations. Therefore, adequate PR is needed in order to raise MT user awareness as the products penetrate the market. The basic writing rules for MME are given in the Appendix.
7 Summary of the Report
The result of the study can be utilised for enhancing communication within a small to medium sized organisation such as MME. As the globalization of Japanese manufacturers becomes common, problems caused by lack of communication can seriously affect management. MT systems are certainly not magic wands which resolve all the communication problems occurring at MME, but they will without doubt help enhance communication between British and Japanese staff if we all know how to utilise them. They will help decrease the time and cost of rough translation. The point is how we can make the most of the existing system.
Japanese managers tend to think that MT is for Japanese staff only. However, it is also a useful tool for British staff. The viewpoints of British staff have rarely been discussed by researchers. British employees working at Japanese factories are used to listening to English spoken by Japanese staff, so translated documents can be understood even if their sentences are not flawless. As long as British employees see English texts, they are more motivated. If a substantial number of Japanese documents are left untranslated in the company, British employees feel they are not part of the team. They suspect that information is being hidden in the Japanese documents. This psychological barrier between Japanese and British employees within Japanese manufacturing companies overseas is often identified, and it should not be ignored if the productivity of the factory is to be enhanced. Therefore, it is also crucial for the management of any Japanese manufacturing company to make sure that British engineers also have access to an MT system when it is introduced.
The introduction of the new system may look exciting and promising in a company, but we must make sure that we use it as a part of our work. The author recommends that MME should appoint one person who is responsible for MT and keep updating the system constantly. He/she must report regularly on how it is utilised within the company. In this way, MT systems can find their place in a company. Otherwise, the novelty will soon wear off and no one will bother to switch it on.
References
- Hutchins, W. J. and Somers, H. L. An Introduction to Machine Translation, Academic Press, UK, 1992
- Ikehara, S., Shirai, S., and Ogura, K. Criteria for evaluating the linguistic quality of Japanese to English machine translations, Journal for Japanese Society for Artificial Intelligence. Vol.9 1994
- Isahara, H. JEIDA's Test-Sets for Quality Evaluation of MT Systems - Technical Evaluation from the Developer's Point of View --, Proc. of MT Summit V, Luxembourg, July 19 - 22, 1995
- Johnson, I., Personal Translation Applications, Proc. of ASLIB 97 Translating and the Computer 19, November 13 and 14, 1997, London
- Jones, I., A translation service to maximize quality and efficiency, Proc. of ASLIB 98 Translating and the Computer 20, November 12 and 13, 1998, London
- Kamei, S., Itoh, E., Fujii, M., Hirai, T., Saitoh, Y., Takahashi, M., Hiyama, T., Muraki, K., Sharable Formats and Their Supporting Environments for Exchanging User Dictionaries Among Different MT Systems as a Part of AAMT Activities, Proc. Machine Translation Summit VI, pp132-141, 1997, San Diego, USA
- Tatakau Nihongo, (Fighting Japanese), p.1, 28 May 1999, Nikkei Shinbun Newspaper, Tokyo, Japan
- Nagao, M.(Editor and author), Sato, S., Kurohashi, S., Tsunoda, T., Shizen Gengo Shori (Natural Language Processing) in Japanese 1996 Iwanami Shoten Publishing Company, Tokyo, Japan
- Oku, M., Analysing Japanese Double-Subject Construction having an Adjective Predicate, COLING 96 Proc. pp. 865-870, 1996
- Povlsen, C., Underwood, N., Music, B., Neville, A. Evaluating Text-type Suitability for Machine Translation a Case Study on an English-Danish MT System, Proc. of the First International Conference on Language Resources and Evaluation, pp. 27-31, Granada, Spain, 28-30 May, 1998
- Shirai, S., Yokoi, A., Okuyama, N., Kawamura, M., Ikehara, S., The Support System for end users to create dictionary of English-Japanese Combining Pattern, Proc. of the Fourth Annual Meeting of the Association for Natural Language Processing in Japan, pp. 568-571, Kyushu, Japan, March 23-26, 1998
- White, John, MT Evaluation, Tutorial given at MT Summit VI, San Diego: Association for Machine Translation in the Americas, 29 Oct.-1 Nov. 1997, San Diego, USA
APPENDIX [Rules for Users of HAII]
Here are basic rules for members of staff at MME to make the most of the commercial machine translation system, HAII.
- Input a grammatical sentence.
(Do not omit the subject or object of a sentence, and do not input a fragment of a sentence.)
- Input the correct kanji.
(A person who uses HAII in the company should check whether he/she is inputting the correct kanji or not.)
- Avoid using a phrasal verb.
- Use a hiragana or kanji character, and avoid using katakana for nouns which are not borrowed words from foreign languages.
(HAII sometimes does not recognise a word written in katakana character.)
- Avoid using a comma; use the word 'to' (and) instead.
- Avoid using kanji for the verb 'okonau' because HAII tends to mistranslate it.
- Avoid a sentence with embedded clauses.
(Make it simple and split up into more than one sentence.)
- Specify singular or plural if necessary.
(HAII treats a noun as a singular form.)
- Avoid using 'koto' or dictionary form when making an imperative sentence.
(Although it is a very typical way of making up an imperative form in Japanese, HAII has difficulty in translating these types of imperative sentences.)
- Use a sentence from example translation database of HAII when translating the opening and closing part of a business letter.
(Set phrases from typical Japanese business letters should be translated as a whole sentence, but not word by word.)
- Avoid using a word which has more than one meaning.
(It is better to use the consistent terminology, so review the definition of well-used words within the company.)
- Input Makita terminology.
(Most terminology is not in the HAII dictionary, therefore it is necessary to input them in the dictionary.)
- Remember that HAII dictionary only carries one definition for one word.
(It takes a while until the system is tuned. Since each entry in the HAII dictionary cannot store more than one meaning, it is advisable to use an entry in only a single domain, such as the translation of operation manuals.)
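Some of the rules above are mechanical enough to be checked before a sentence is submitted for translation. The following Python sketch is a rough illustration only: it works on romanised input, whereas real input would be in kana and kanji; the function name `check_sentence` is hypothetical; and the subject check is a crude heuristic that merely looks for the particles 'ga' or 'wa'. It covers three rules: the comma rule, the omitted-subject rule and the 'koto' imperative rule.

```python
import re


def check_sentence(sentence: str) -> list:
    """Return writing-rule warnings for one romanised Japanese sentence."""
    tokens = re.findall(r"[a-z\-]+", sentence.lower())
    warnings = []
    # Rule: avoid commas (Western or Japanese) and join with 'to' instead.
    if "," in sentence or "\u3001" in sentence:
        warnings.append("comma: consider joining with 'to' instead")
    # Rule: do not omit the subject (heuristic: look for 'ga' or 'wa').
    if not {"ga", "wa"} & set(tokens):
        warnings.append("subject: no 'ga' or 'wa' found; subject may be omitted")
    # Rule: avoid 'koto' as an imperative ending.
    if tokens and tokens[-1] == "koto":
        warnings.append("imperative: avoid sentence-final 'koto'")
    return warnings
```

A sentence such as 'Hako wo naraberu koto' triggers both the subject and the imperative warnings, while 'Watashi tachi wa shinamono wo tsukuru' passes. The remaining rules, such as checking kanji or word sense, depend on judgement that a simple check of this kind cannot supply.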