Introduction to "Goi-Taikei: A Japanese Lexicon"

Satoshi SHIRAI*2 *1, Yoshifumi OOYAMA*1, Satoru IKEHARA*3, Masahiro MIYAZAKI*4, and Akio YOKOO*2

*1 NTT Communication Science Laboratories,
*2 ATR Interpreting Telecommunications Research Laboratories,
*3 Faculty of Engineering, Tottori University, and
*4 Faculty of Engineering, Niigata University


Abstract

To improve the quality of Japanese-to-English machine translation, we have developed a translation system along with the semantic dictionaries it requires. The semantic dictionaries consist of a semantic attribute system, a semantic word dictionary and a semantic valency dictionary. The semantic attribute system provides a hierarchy of 3,000 attributes, which are used to describe different characteristics of common nouns, proper nouns and predicates.

The semantic word dictionary consists of 400,000 words, including spelling variations and some 200,000 proper nouns. Words are marked with syntactic and semantic information.

The semantic valency dictionary covers 6,000 Japanese verbs, adjectives and nouns, described by 16,000 Japanese and English pattern pairs. The patterns are marked with verbal semantic attributes, and constraints on their arguments are expressed with the common noun semantic attributes.

In this paper, we describe how these dictionaries were developed, and give an outline of the published subset: "Goi-Taikei: A Japanese Lexicon".



[ IPSJ SIG Notes, 98-IM-34-9, pp.47-52 (November, 1998). ]