The Quantity of Valency Pattern Pairs Required for Japanese to English MT and Their Compilation

Satoshi Shirai+, Satoru Ikehara+, Akio Yokoo+ and Hiroko Inoue++


+NTT Communication Science Laboratories
Take 1-2356, Yokosuka, 238-03, Japan
{shirai,ikehara,ayokoo}@nttkb.ntt.jp

++NTT Advanced Technology Corporation
Kawakami-cho 90-6, Totsuka-ku, Yokohama, 244, Japan
inoue@totsuka.ntt-at.co.jp


Abstract

To put into practice the valency pattern method, which is used in the semantic analysis of co-occurrences of verbs and nouns, this paper discusses how many pattern pairs should be prepared and how these patterns can be compiled comprehensively. A pattern pair preparation method is proposed that combines existing knowledge compiled in dictionaries for human use with examples prepared manually by relying on personal knowledge.

Specifically, three methods are examined. The results show that Japanese to English machine translation requires about 7,500 pattern pairs to cover the 1,000 verbs of Japanese origin that are critical to differentiated translation. Preparing this number of pairs requires collecting 15,000 examples. It is also predicted that about 25,000 pattern pairs would be required to cover all Japanese predicates, including verbs of Chinese origin and idiomatic expressions of declinable word type. Furthermore, the method of preparing examples through human knowledge is shown to be entirely feasible.



Keywords

valency pattern, semantic analysis, Japanese to English machine translation, bilingual corpus



[ IPSJ SIG Notes, 95-NL-110-7, pp.43-50 (November, 1995). ]