Effects of Automatic Rewriting of the Source Language within a Japanese to English MT System

SATOSHI SHIRAI,+ SATORU IKEHARA,+ TSUKASA KAWAOKA+ and YUKIHIRO NAKAMURA++


To improve the quality of machine translation, it is important to develop a translation method that takes into account the conceptual differences between languages, which cause difficult problems in translation. Many machine translation methods, such as the Multi-Level Translation method and the Example-Based Translation method, have been proposed so far. The conceptual differences between languages typically appear in expressions that currently require manual pre-editing of the source text. If such difficult-to-translate expressions could be automatically rewritten into easy-to-translate expressions, these problems would be largely solved. Automatic rewriting has been difficult, however, because the same structure can carry different meanings. This paper proposes a translation method that includes an automatic source text rewriting function, based on the following two considerations. First, if both precise syntactic attributes and semantic attributes are available, the conditions for applying rewriting rules can be defined correctly. Second, if rewriting rules are applied to intermediate expressions after syntactic analysis, most undesired effects can be avoided, because sufficient information for applying the rules is available at that point. This method has the advantage of being able to use existing translation functions to translate difficult-to-translate expressions. At the same time, it improves processing efficiency by reducing ambiguities in syntactic and semantic analysis. In translation experiments using newspaper articles, rewriting rules were applied to 44 sentences (43%) out of a total of 102 sentences (in 32 articles), at an aggregate total of 52 locations. Translation quality improved in 33 of these 44 sentences (75%), and there was no degradation in the remainder. Furthermore, ambiguities in semantic analysis were reduced from an average of 5.39 per sentence to 1.31 per sentence. These results show that this simple method gives a substantial improvement in translation quality.
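
The idea of applying rewriting rules to intermediate expressions, with conditions on both syntactic and semantic attributes, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the system described here: the Node and RewriteRule types, the attribute labels ("verbal_noun", "action"), and the single toy rule that rewrites "<verbal noun> + 行う" into the simpler "<verbal noun>する".

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Node:
    """Hypothetical node of an intermediate expression (post syntactic analysis)."""
    surface: str
    pos: str                                             # syntactic attribute
    semantics: Set[str] = field(default_factory=set)     # semantic attributes
    children: List["Node"] = field(default_factory=list)


@dataclass
class RewriteRule:
    """Hypothetical rule: a condition on attributes plus a rewriting action."""
    name: str
    condition: Callable[[Node], bool]
    action: Callable[[Node], Node]


def is_okonau_with_action_noun(node: Node) -> bool:
    # Syntactic condition: the verb 行う governing exactly one verbal noun.
    if node.pos != "verb" or node.surface != "行う":
        return False
    if len(node.children) != 1:
        return False
    obj = node.children[0]
    # Semantic condition: the noun must denote an action. This is what keeps
    # the rule from firing on the same structure with a different meaning.
    return obj.pos == "verbal_noun" and "action" in obj.semantics


def to_suru_verb(node: Node) -> Node:
    # Collapse "<noun> + 行う" into the single verb "<noun>する".
    obj = node.children[0]
    return Node(obj.surface + "する", "verb", set(obj.semantics))


RULES = [RewriteRule("okonau-to-suru", is_okonau_with_action_noun, to_suru_verb)]


def rewrite(node: Node, rules: List[RewriteRule] = RULES) -> Node:
    """Rewrite children first, then apply at most one matching rule to the node."""
    node.children = [rewrite(child, rules) for child in node.children]
    for rule in rules:
        if rule.condition(node):
            return rule.action(node)
    return node


# Same structure, different attributes: only the first one is rewritten.
inspect = Node("行う", "verb", children=[Node("検査", "verbal_noun", {"action"})])
festival = Node("行う", "verb", children=[Node("祭り", "noun", {"event"})])
print(rewrite(inspect).surface)
print(rewrite(festival).surface)
```

Because the rule inspects an already analyzed node rather than the raw character string, its condition can refer to part of speech and semantic class directly, which is the kind of information that is not available to string-level pre-editing.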


[ Transactions of Information Processing Society of Japan, pp.12-21 (January, 1995). ]





+ NTT Communication Science Laboratories
++ NTT Information and Communication Systems Laboratories