To extract the frequently used expressions and fixed turns of phrase needed for natural language processing such as machine translation, this paper proposes algorithms that efficiently and automatically extract both uninterrupted and interrupted collocations from very large corpora. For uninterrupted collocations, the recently proposed n-gram statistics method can be used, and it yields substrings in order of length and frequency of use. However, it extracts an enormous number of fragmentary substrings, and narrowing them down has been a problem. For interrupted collocations, no suitable method existed. This paper therefore first proposes an algorithm for uninterrupted collocations that extracts substrings of at least an arbitrary length and at least an arbitrary frequency while greatly suppressing the extraction of fragmentary substrings. It then proposes a method that combines the uninterrupted collocations thus obtained to extract interrupted collocations automatically and exhaustively. The methods were applied to three months of newspaper articles (8.92 million characters). For uninterrupted collocations of 2 or more characters that occur 2 or more times, the conventional n-gram method extracted 4.4 million types (total frequency 31.2 million), whereas the proposed method extracted 0.97 million types (total frequency 2.6 million), a substantial reduction in fragmentary expressions. For interrupted collocations, taking the substrings that occurred 10 or more times in the uninterrupted extraction (12,350 types) and pairing any two of them, the pairs that co-occurred within one sentence 2 or more times were easily obtained: 6,500 types of pairs with a total frequency of 21,800.
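Recently, the importance of large corpora and usage examples has been pointed out in natural language processing, and the need for techniques to analyze them is growing. In machine translation, for example, methods have been considered that collect phrases which cannot be translated well by word-by-word literal translation and translate them phrase by phrase, or that turn expressions with a fixed structure into bilingual patterns and map the source language to the target language through a pattern dictionary. To realize these methods, frequently used phrases and expression patterns must be extracted from language data in actual use.
However, when the language data is enormous, automatically and exhaustively finding and extracting frequently occurring expression strings of arbitrary length has been difficult in terms of computational cost. For this reason, methods have been considered that extract only a limited set of strings matching a given purpose, either by focusing on features of natural language or by focusing on properties of the strings to be extracted. As examples of the former, from the viewpoint of picking out strongly associated words from language data, a method focusing on the strength of association between two words 1), a method focusing on the distance between words 2), and methods that take into account the number of combined words and the number of occurrences 3),4) have been proposed. As examples of the latter, methods have been considered that limit the number of words or characters in a chain, or that focus on short, frequent chains and then tabulate longer chains only within that limited range of character strings (word strings) 5).
In contrast, a method has recently been proposed that computes n-gram statistics for arbitrary n at high speed on large language data 6), making it possible to automatically extract strings of arbitrary length (in general, symbol strings) from the data and count their occurrences. With its result, the strings used in the source text can be tabulated in order of length (number of characters) and in descending order of frequency. However, because this method ignores the mutual relations among the extracted strings, substrings of strings that have already been extracted are extracted again. Consequently, when the extracted strings are viewed as linguistic expressions, fragmentary strings with no grammatical or semantic coherence make up the majority. As a way of narrowing these down to meaningful strings, the same paper 6) points out the possibility of combining the extracted strings with one another together with their occurrence counts. Subsequently, a method that applies an entropy criterion to the extracted strings 7) has been proposed for picking out meaningful expressions from the strings obtained as n-gram statistics. There is also an application of n-gram statistics to the extraction of particle-like fixed expressions 8); that method avoids the computational cost of n-grams by restricting in advance the set of character types that may make up an extracted string, and then narrows down the extracted strings with various heuristics.
Turning to the extraction of sets of expressions that co-occur at separated positions, it is necessary to combine multiple strings and examine their co-occurrence in the source text. Because n-gram statistics extracts an enormous number of strings, searching the source text for every combination of the extracted strings has been physically difficult. There have been attempts 9),10) that, without distinguishing uninterrupted from interrupted types, extract as stereotyped sentences those sentences in which collocations account for a large share, but no effective method is known for automatically and exhaustively finding and tabulating sets of frequently occurring strings in large language data.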
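When the language data is large, collocation extraction faces two problems: first, whether the computation is feasible at all given the growth in computational cost (file volume), and second, how to select the needed expressions from the large volume of results. For interrupted collocations in particular, the cost grows geometrically with the number of component expressions. When considering how to reduce the cost, a narrowing-down that loses needed collocations is undesirable in view of the purpose of collocation extraction.
For uninterrupted string extraction, the first problem has already been solved by the n-gram statistics method, but many fragmentary strings that cannot be regarded as units of expression (including fragments of words) are extracted. As a result, for interrupted collocations the computational cost grows and the computation becomes infeasible. If the extraction of fragmentary strings is suppressed and the cost falls within a feasible range, the first problem is solved for interrupted collocations as well. As for the second problem, the final judgment has to be made manually for each intended use, so if the volume (number of types) of output strings falls within a range that manual work can handle (several thousand types, at most a few tens of thousands), the second problem can also be considered solved for the time being.
From this viewpoint, this paper proposes, as a way of suppressing fragmentary strings in uninterrupted collocation extraction, a method that automatically and exhaustively extracts and tabulates collocations of at least an arbitrary length and at least an arbitrary frequency of use under the condition of longest-match extraction (once a string has been extracted, the substrings contained in it are not extracted). It then presents a method that uses the result to automatically extract and tabulate interrupted collocations, in which multiple elements co-occur at separated positions. As an application example to confirm the behavior of the proposed methods, the results of extracting uninterrupted and interrupted collocations from Japanese newspaper article data are also presented.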
(1) Conditions for string extraction
Expressions☆ that co-occur in natural-language sentences are of two kinds: those that form a contiguous string, such as compound words and phrases (called uninterrupted collocations), and those in which two or more strings appear at separated positions in a sentence, such as kakari-musubi, concord relations, and pairs of a particular verb and a particular noun (called interrupted collocations). Since an interrupted collocation can be regarded as uninterrupted collocation strings co-occurring within a sentence, the former kind of string is considered first.
To find contiguous strings such as compound words and phrases exhaustively, and to keep to a minimum the extraction of fragmentary strings that do not form a unit of expression grammatically or semantically, strings are extracted under the following conditions.
Condition 1: Extract strings of at least an arbitrary specified length.
Condition 2: Extract strings of at least an arbitrary specified frequency of occurrence.
Condition 3: Extract strings according to the longest-match principle.
Condition 3 means that once a string has been extracted from some place in the source text, the substrings contained in that string are not extracted. Such a substring is still extracted when it appears at a different place. In Fig. 1, for example, if the 7-gram string α has been extracted, then in the subsequent extraction of strings of 6-grams or shorter, β and γ, which are substrings of the α portion, are excluded. Naturally, "DE" and "GHI" appearing at positions other than where α was extracted are extracted. The string δ is not a substring of α, so it is also extracted.
(2) Significance of the longest-match principle in expression extraction
In general, linguistic expressions are built from larger and smaller expressions nested in many layers. In collocation extraction it is desirable to find and extract automatically the units of expression that are used repeatedly, without specifying the unit or length of expression in advance. Extracting all strings exhaustively does capture such expressions, but substrings are then extracted again from inside strings that have already been extracted, so the result contains many fragmentary strings.
If a collocation in language is regarded as an expression in which multiple words co-occur, the boundaries of a collocation string are also word boundaries. Conversely, if strings are extracted in the longest possible units, their boundaries are likely to coincide with word boundaries, so the strings are more likely to be collocations than fragments. In other words, the extraction of fragmentary strings can be expected to be suppressed. This is why Condition 3 was introduced in (1).
Consider now the strings whose extraction is suppressed by Condition 3. They are of two kinds: those that are never extracted because they are used only as part of a larger string, and those that are extracted several times as expressions independent of other parts but whose count is suppressed where they occur as a substring of some other string. From the viewpoint of exhaustive coverage of collocations, the former omission is the issue, and what matters is whether it includes strings that can be regarded as expressions.
However, even if an expression is a partial expression buried inside a larger string, if it is highly independent and used repeatedly, it can be expected to appear repeatedly not only as a substring of some string but also as a string that is itself the longest unit☆. Therefore, even with Condition 3, the (types of) collocations that are used repeatedly can be expected to be extracted exhaustively.
(3) The Nagao-Mori method and its problems
The method of Nagao and Mori 6) has already been proposed for efficiently extracting and tabulating n-grams for arbitrary n. It can be summarized as follows.
[The Nagao-Mori method]
Let N be the total number of characters in the language data to be tabulated.
Step 1: "Creating the source-address file"
Prepare a file of N records (the source-address file) and store the values 0 to N-1 (source addresses) in the records in order. A source address acts as a pointer to the substring of the language data that begins at the character position given by its value and ends at the last character (character N-1); such a substring is called a string word below.
Step 2: "Creating the general sort file"
Create a file (the general sort file) in which the records of the source-address file are sorted in character-code order of the corresponding string words.
Step 3: "Counting coinciding characters"
Compare the string word of each record in the general sort file with the string word of the record immediately after it, starting from the first character, and write in the number of characters that coincide (the coincidence count).
Step 4: "Extracting and counting strings"
Examine the coincidence counts in record order and compile the types of substrings and their numbers of occurrences.
With this method, the strings that occur at least an arbitrary number of times are obtained for each length (number of characters) and in descending order of frequency, so Conditions 1 and 2 are satisfied, but Condition 3 is not.
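The four steps above can be sketched in memory as follows. This is a minimal illustration under stated assumptions, not the file-based implementation the text describes: the names `ngram_statistics` and `common_prefix_len` and the parameters `min_len` and `min_freq` are introduced here for illustration, and Python's built-in sort over suffix slices stands in for the sort of the source-address file.

```python
def common_prefix_len(text, a, b):
    """Number of leading characters shared by the suffixes starting at a and b."""
    k = 0
    while a + k < len(text) and b + k < len(text) and text[a + k] == text[b + k]:
        k += 1
    return k


def ngram_statistics(text, min_len=2, min_freq=2):
    """Count every substring of >= min_len characters occurring >= min_freq times."""
    n = len(text)
    # Steps 1-2: source addresses 0..N-1, sorted by the suffix each one points to.
    sorted_addrs = sorted(range(n), key=lambda i: text[i:])
    # Step 3: coincidence count between each record and the record after it.
    coincide = [common_prefix_len(text, sorted_addrs[k], sorted_addrs[k + 1])
                for k in range(n - 1)]
    # Step 4: a run of r consecutive counts >= length marks r + 1 occurrences
    # of one substring of that length.
    table = {}
    for length in range(min_len, max(coincide, default=0) + 1):
        run = 0
        for k, shared in enumerate(coincide + [0]):  # trailing 0 closes the last run
            if shared >= length:
                run += 1
            elif run:
                if run + 1 >= min_freq:
                    start = sorted_addrs[k]  # last record of the run
                    table[text[start:start + length]] = run + 1
                run = 0
    return table


if __name__ == "__main__":
    print(ngram_statistics("abcXabcYbcZbc"))
```

On the toy input `"abcXabcYbcZbc"` the table contains "ab" (2), "bc" (4) and "abc" (2): the fragment "ab" and the occurrences of "bc" inside "abc" are all counted, which is the behavior that Condition 3 is meant to suppress.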
To satisfy Condition 3, one could think of combining several n-gram tables of different orders, for example by subtracting from the occurrence count of an extracted string the occurrences in which it was contained in a longer string. However, once the tables have been generated, the information about how the extracted strings relate to one another in the source text has been lost, so this calculation is impossible☆.
We therefore go back to the general sort file and consider a method in which, within the language data, a part of a string that has already been extracted is not extracted or counted again as a separate string. In the following, substrings are assumed to be extracted from the general sort file in descending order of coincidence count.
Consider the extraction of an n-gram string and an m-gram string. If n > m, then by this assumption the n-gram string is extracted before the m-gram string. The problematic cases are those in which the n-gram string and the m-gram string share a common part in the source text. As shown in Fig. 2, these are classified into the case in which the m-gram string is contained within the n-gram string and the case in which the m-gram string and the n-gram string share part of each other.
(1) Range of records that need to be invalidated
When an n-gram has been extracted first, none of the m-grams of case 1 are to be extracted. Therefore, when an n-gram string is extracted, m-grams in this relation must be prevented from being extracted in later processing. We thus consider a method that, when an n-gram is extracted, looks up the m-grams contained in it on the general sort file and attaches to the corresponding records a condition under which they are treated as invalid.
First, consider which records are candidates for invalidation. In case 1-1, it is enough to ensure that the record of the extracted n-gram itself does not become an extraction target again. In cases 1-2 and 1-3, the candidate records are the records of the string words whose first character is one of the characters lying, in the source text, within n characters from the start position of the n-gram in question.
Next, consider the condition for invalidation. The m-gram of case 2-2 must not be invalidated, so among the candidate records above, the records to be invalidated are limited to those whose coincidence counts are at most n-1, n-2, ..., 1, respectively. An m-gram such as that of case 2-1 lies outside this invalidation processing and remains a target of extraction and tabulation.
Fig. 3 shows an example of the range covered by this invalidation processing. In the figure, when the 6-gram string "C-H" (C through H) starting at the record of source address 3 is judged to be an extraction target, the parts up to H of the strings at source addresses 4 to 8 are invalidated.
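The invalidation range described above can be written as a short sketch. It assumes that each source address already carries the length of the string that would be extracted there; the function name `invalidated_addresses`, the list `ext_len` and the sample values are hypothetical and are not taken from Fig. 3.

```python
def invalidated_addresses(ext_len, start, n):
    """Addresses invalidated when the n-gram starting at `start` is extracted.

    The record k characters after `start` is invalidated only if its own
    string is at most n - k characters long, i.e. it ends inside the n-gram.
    A longer string there overlaps the n-gram only partly (case 2-2) and is kept.
    """
    addrs = []
    for k in range(1, n):
        addr = start + k
        if addr >= len(ext_len):
            break  # ran past the end of the data
        if ext_len[addr] <= n - k:
            addrs.append(addr)
    return addrs


if __name__ == "__main__":
    # Hypothetical lengths per source address; a 6-gram starts at address 3.
    ext_len = [0, 0, 0, 6, 5, 4, 7, 2, 1, 0]
    print(invalidated_addresses(ext_len, 3, 6))
```

With these sample values, addresses 4, 5, 7 and 8 are invalidated, while address 6 is kept because its 7-character string extends beyond the end of the 6-gram.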
(2) Searching for the records to be invalidated
The records of the general sort file are in no particular order with respect to the source-address value i that identifies each string word. Consequently, given the source-address value i of some record, finding the records whose source-address values are i + 1, i + 2, ... requires a sequential search, and search time becomes a serious problem. In the original source-address file, by contrast, the records are arranged in order of source-address value i. That is, the records to be checked for invalidation follow in sequence immediately after the record that raised the invalidation request, so the search can be carried out at high speed. The general sort file is therefore sorted once more into source-address order, and the invalidation processing is performed on the resulting file☆.
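(1) Algorithm
Based on the discussion in the previous chapter, we propose an algorithm that extracts from language data, as strings, the fixed (highly independent) expressions that occur 2 or more times, in descending order of number of characters and without duplication.
[String extraction algorithm]
Steps 1 to 3 : Same as the Nagao-Mori method
Step 4 : "Entering the extraction length"
For the string word of each record in the general sort file, determine how many characters from the beginning are an extraction target (the extraction length) and enter the value in the record (this yields the extended general sort file). The extraction length is determined simply from the relation between the coincidence counts of the preceding and following records.
Step 5 : "Creating the extended source-address file"
Re-sort the extended general sort file into source-address order to obtain the extended source-address file.
Step 6 : "Valid/invalid judgment"
Examine the extraction lengths of the records of the extended source-address file in order and judge whether each record is invalid. The result is entered as the value of the accept/reject flag. The judgment is made as described in Section 3.1.
Step 7 : Sort the extended source-address file obtained above once more into the record order of the general sort file; the result is called the re-extended general sort file.
Step 8 : "Tabulating the extracted strings"
Examine the relation among the accept/reject flag, the extraction length, and the coincidence count in the re-extended general sort file to determine the strings to extract, and at the same time obtain their numbers of occurrences.
Because the extraction length can be derived from the relation between the coincidence counts of the preceding and following records (see Step 4), the tabulation can be done without referring to the extraction length.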
(2) Worked example
Fig. 4 shows an example of applying the above algorithm. In this example, n-gram statistics extracts 24 types of strings with a total of 72 occurrences, whereas the method of this paper narrows this down to 5 types and 10 occurrences.
We now consider a method for obtaining sets of two or more expressions that co-occur at separated positions within one sentence (interrupted collocations), together with their numbers of occurrences. In the extraction of uninterrupted collocations (the method of Chapter 3), strings that span more than one sentence were excluded, so the extracted uninterrupted collocations are closed within a sentence. To extract interrupted collocations, it therefore suffices to scan the language data from the first sentence onward and count, for each set of strings, the occurrences in which a set of uninterrupted collocation strings appears within one sentence. The issues are the handling of the sentence-boundary character (the full stop) and the positional relation between the expressions to be extracted.
(1) Handling of full stops
Japanese sentences normally end with a full stop, so the span from one full stop to the next is taken as one sentence. For sentences that contain another sentence with its own full stop, such as quotations, the contained sentence (the span between the paired quotation marks) is ignored for simplicity.
(2) Mutual relations of the extracted strings
In interrupted string co-occurrence, pairs of strings that are adjacent to each other or that partially overlap within a sentence fall outside the extraction. We therefore consider the mutual relations of the strings extracted in Chapter 3.
Let α and β be uninterrupted strings extracted from the same sentence. Their positional relation in the source text is one of the three relations shown in Fig. 5. Case (c), in which α and β are separated, is naturally a target of interrupted collocation extraction, so cases (a) and (b) are considered here.
(a) When α and β are adjacent
In the language data there is at most one place that contains such a string. This is because, if two or more places contained it, the string αβ would be tallied as a longer string γ, and the substrings α and β within those sentences would not be counted. Therefore, when α and β co-occur within a sentence 2 or more times, all the relevant sentences except at most one are co-occurrences of type (c) (separated). In this situation a type (a) co-occurrence can normally be regarded as strings that co-occur in separated form happening to be adjacent, so it is included in interrupted collocation extraction.
(b) When α and β overlap
Let γ be the string that contains both α and β. As in the previous item, if such a string γ appears at two or more places in the language data, γ itself becomes a target of uninterrupted collocation extraction, and the strings α and β contained in it are not extracted. Therefore, the sentences from which α and β in relation (b) are extracted are limited to at most one, and the remaining sentences in which α and β co-occur are all of type (c). In this case, however, α and β of type (b) cannot be said to co-occur within the sentence, so they are excluded from extraction and tabulation.
From the above, in extracting interrupted collocations within a sentence, only co-occurrences of type (b) need to be excluded.
(3) Handling of the order of appearance of expression elements
In an interrupted collocation, the order in which its component elements (here, the substrings extracted as uninterrupted collocations) appear is meaningful, so collocations are extracted and tabulated with the order of appearance distinguished.
(1) Algorithm (see Fig. 6)
[Preparation]
Assign a string number to each string extracted as an uninterrupted collocation on the re-extended general sort file.
Step 9 : "Re-sorting the re-extended general sort file"
Sort the re-extended general sort file in order of source-address value, returning it to the record order of the extended source-address file.
Step 10 : "Assigning sentence numbers"
Enter a sentence number in each record of the resulting file.
Step 11 : "Compressing the file"
Compress the above file by the following procedure to create the "interrupted-collocation compressed file" (this frees unneeded work space in preparation for the next step).
① Delete all fields other than the four fields sentence number, string number, extraction length, and source address.
② Delete the records that have no value in the string-number field.
Step 12 : "Extracting and counting interrupted collocations"
In general, to extract interrupted collocations consisting of k types of strings (k ≥ 2), write to a file all combinations of k string numbers found within the same sentence (each arranged as a set in the order of appearance in the sentence), sort the file, and count the number of identical sets.
This yields the tabulation of interrupted collocations. To output the sentences that contain these expressions, it is enough to append the sentence number to each set of expressions created in Step 12.
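For k = 2, Step 12 can be sketched as follows. The record layout (sentence number, string number, source address, extraction length) follows the four fields kept in Step 11; the function name `interrupted_pairs`, the use of an in-memory `Counter` in place of writing pairs to a file and sorting it, and the sample records are assumptions made for illustration. Overlapping pairs (type (b)) are skipped, adjacent pairs (type (a)) are kept, and order of appearance is preserved.

```python
from collections import Counter
from itertools import combinations


def interrupted_pairs(records, min_freq=2):
    """Count ordered pairs of string numbers that co-occur within one sentence.

    records: iterable of (sentence_no, string_no, source_addr, ext_len).
    Returns {(first_string_no, second_string_no): count} for count >= min_freq.
    """
    by_sentence = {}
    for sentence_no, string_no, addr, length in records:
        by_sentence.setdefault(sentence_no, []).append((addr, length, string_no))

    counts = Counter()
    for items in by_sentence.values():
        items.sort()  # order of appearance in the sentence
        for (a1, l1, s1), (a2, l2, s2) in combinations(items, 2):
            if a2 < a1 + l1:
                continue  # type (b): the two strings overlap
            counts[(s1, s2)] += 1
    return {pair: c for pair, c in counts.items() if c >= min_freq}


if __name__ == "__main__":
    sample = [
        (0, 1, 0, 3), (0, 2, 5, 2),    # separated: counted as (1, 2)
        (1, 1, 10, 3), (1, 2, 13, 2),  # adjacent: still counted as (1, 2)
        (2, 2, 20, 2), (2, 1, 24, 3),  # reverse order: counted as (2, 1)
        (3, 1, 30, 3), (3, 2, 32, 2),  # overlapping: skipped
    ]
    print(interrupted_pairs(sample))
```

With the sample records, only the pair (1, 2) reaches the threshold of 2; the reversed pair (2, 1) occurs once and is dropped. Carrying the sentence number alongside each pair, as the text suggests, would make it possible to list the sentences in which a pair occurs.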
(2) Worked example
The above procedure was applied to the example of Fig. 4 to obtain interrupted collocations with 2 elements. Fig. 6 shows the result. In this example, of the 25 pairs formed from the 5 types of uninterrupted collocation strings extracted in Chapter 3, 6 pairs co-occur apart within one sentence 2 or more times, and their total number of occurrences is 12.
Fig. 6 Example of interrupted collocational substring extraction (continues from Step 7 of Fig. 4).
5. Collocation extraction experiments
To verify the effect of the proposed methods, extraction experiments for uninterrupted and interrupted collocation strings were carried out on three months of Nikkei newspaper articles (8.92 million characters) as an application to Japanese data. Strings containing symbols other than the comma were excluded from extraction. The computer used was a XEROX ARGOSS 5270 (SUN OS4.1.3), and the maximum memory used was 48 MB. This chapter describes the characteristics of the strings obtained and the processing time.
5.1 Extraction of uninterrupted collocations
(1) Properties of the extracted strings
Tables 1 and 2 compare, against the conventional method, the number of types and the total number of occurrences of the strings extracted when the number of characters or the number of occurrences is restricted. Table 3 shows the number of types, the number of occurrences, and examples of the extracted strings by string length. Table 4 shows examples of frequently occurring strings.
Table 1 Number of extracted substrings and their total frequency (No. 1*).
* Tabulated by string length
Table 2 Number of extracted substrings and their total frequency (No. 2*).
* Tabulated by frequency of occurrence
Table 3 Examples of extracted substrings (listed in descending order of frequency; frequency in parentheses).
Table 4 Examples of substrings with high frequency.
These tables show the following. As expected, compared with the conventional method, the method of this paper suppresses the extraction of many fragmentary strings, and both the number of types and the number of occurrences of the extracted strings decrease substantially. For strings of 2 or more characters occurring 2 or more times, for example, the number of types is reduced to about one fifth and the total number of occurrences to about one twelfth. The effect is larger for longer strings: for strings of 20 or more characters, both the number of types and the total number of occurrences fall to about one hundredth.
(2) Processing time
The initial sort of the string words (creation of the general sort file) took the most time (turnaround time: about 40 hours; CPU time: about 10 hours)☆, while the subsequent processing was very short by comparison (turnaround time: 34 minutes; CPU time: 16 minutes).
5.2 Extraction of interrupted collocation strings
(1) Properties of the extracted strings
For simplicity, Table 5 shows the number of extracted pairs of strings for the case☆ in which two types of strings, each occurring 10 or more times on its own, co-occur apart within one sentence. Tables 6 and 7 show, respectively, the pairs with high frequency and, among the pairs occurring 2 or more times, the pairs with the largest total number of characters.
Table 5 Characteristics of extracted pairs of substrings (number of types and total frequency; within-sentence co-occurrence of two types of strings).
Table 6 Pairs of substrings with high frequency.
Table 7 Pairs of substrings with the largest total length.
Tables 6 and 7 show that most of the high-frequency interrupted collocations are co-occurrences between nouns. In particular, many co-occurrences are extracted that involve proper nouns taken up as topics in newspaper articles and quantities such as dates and times☆☆. Such noun co-occurrence information can be applied, for example, to building dictionaries for machine translation. In template-based translation and similar uses, on the other hand, one may want to collect co-occurrences of expression elements containing particles and auxiliary verbs, from which sentence patterns are easy to build, rather than co-occurrences between nouns. As Table 5 shows, the extracted pairs of expressions have already been narrowed down considerably (6,544 in total), so it is not very difficult to check all of them manually and select the pairs suited to the purpose, such as pairs containing particles and auxiliary verbs. With still larger language data, however, the volume of output will grow and manual selection may become difficult. In that case, one option is to delete selectively from the results the expressions that do not suit the purpose, but it is also possible to intervene in the extraction of uninterrupted and interrupted collocations and restrict the strings to be extracted. If, at as early a stage as possible, constraints are placed on the character-type composition of the target strings, or the extracted strings are checked and unneeded ones deleted, the subsequent computation decreases and so does the work of checking the output☆☆☆. As one example, Tables 8 and 9 show part of the interrupted collocation results obtained under the condition that strings containing no hiragana characters and strings containing symbols or alphanumeric characters are not extracted. In this case, interrupted collocations that correspond to the sentence patterns of newspaper articles are extracted.
Table 8 Interrupted collocational expressions (in the order of frequency; frequency in parentheses).
(Result of extracting elements that contain hiragana and contain no symbols or alphanumeric characters)
Table 9 Interrupted collocational expressions (in the order of total length).
(Result of extracting elements that contain hiragana and contain no symbols or alphanumeric characters)
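(2) Language data volume and processing size
Interrupted collocation handles sets of the expressions extracted as uninterrupted collocations, so the capacity of the file to which the sets are written is expected to be the issue. The required size of this file is determined by the number of expressions that occur as interrupted co-occurrences (the total frequency, counting frequency 1 and above). In the experiment, when the 970 thousand types of expressions (total frequency 2.6 million) extracted as uninterrupted collocations were used as they were, without a cutoff (apart from excluding those of frequency 1), to compute interrupted collocations with 2 elements, 180 thousand types of pairs (total frequency 400 thousand) were obtained as interrupted expressions of frequency 2 or more. In this case, 20 million pairs (as pairs of string numbers) were written to the file in Step 12, requiring 400 MB of file space (20 bytes per string pair). In contrast, when only the strings of frequency 10 or more among those extracted as uninterrupted collocations were used, 12 thousand types (total frequency about 220 thousand), and the interrupted collocations having them as elements were computed, 6,500 types of expressions (total frequency about 20 thousand) were obtained as interrupted collocations of frequency 2 or more. About 580 thousand pairs were written to the file during this computation, using about 12 MB of file space, which is less than 1/30 of the amount needed without a cutoff.
Consider now the relation between language data volume and processing size. The total frequency of the expressions extracted as uninterrupted collocations is roughly proportional to the volume of language data, and the total frequency of the expressions extracted as interrupted collocations is considered roughly proportional to the square of the total frequency of the uninterrupted ones, so the file usage for tabulating interrupted collocations is estimated to be roughly proportional to the square of the language data volume. However, when the language data volume increases, raising the cutoff value for uninterrupted collocations in proportion is not expected to lower the extraction accuracy, and the important (high-frequency) expressions can be expected to be collected without omission☆. Table 2 indeed shows that the total frequency of the extracted uninterrupted collocations decreases roughly in inverse proportion to the cutoff value. From these points, when the source data volume increases, raising the cutoff value in proportion makes it possible to compute interrupted collocations without lowering extraction accuracy, and the growth of the file volume required for the computation can then be expected to be held to an order proportional to the growth of the language data.
5.3 Future improvements and applications
(1) Specifying the types of strings to extract according to purpose
For interrupted collocation extraction, increasing the volume of language data that can be processed requires, above all, reducing as far as possible the number of types of uninterrupted strings used for it.
The strings extracted in the experiment, however, are still a mixture of various kinds. In the case of Japanese data, for example,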
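(2) Partially restoring suppressed string counts
In this paper, extracting a substring from inside a string that has already been extracted was regarded as double counting in the computation of uninterrupted collocations, and Condition 3 (extract only longest matches) was taken as a premise. The extracted strings are therefore narrowed down to strings that are independent and can be regarded as uninterrupted collocations. When interrupted collocations made up of finer elements are also to be collected, however, elementary expressions can be extracted from the elements inside strings that have already been extracted. To extract these elementary expressions while still suppressing fragmentary strings, the substrings β and γ contained in string α in Fig. 1 can be added to the count whenever the same string also occurs elsewhere in the source text and becomes an extraction target there. Concretely, in Step 8 of the algorithm of Chapter 3, when an n-gram string is extracted, the records that continue upward from that record and whose extraction length is n+1 or more are also added to the extraction of the n-gram. If the records that newly become extraction targets (the records subject to duplicate extraction) are copied and appended at that point, no modification of the interrupted collocation computation is needed.
(3) Handling of one-character elements in interrupted collocation extraction
In the experiment of this paper, the target strings were limited to 2 or more characters to reduce the computational cost. In interrupted collocation extraction, however, when Japanese sentence patterns are to be extracted, one may want to search sentences for sets of one-character keywords, as in "〜が〜を〜に〜". In such cases the method of this paper can be applied to the result of morphological analysis, as described below, but it is also conceivable to make the extraction feasible by narrowing down the targets, for example with the approach described in (1), and so reducing the computational cost.
(4) Application to morpheme sequences, word sequences, and the like
To extract Japanese sentence patterns, the method of this paper is expected to be applied to the symbol sequences that represent the grammatical and semantic attributes of the words obtained by morphological analysis of the language data. What kinds of sentence-pattern information can be obtained grammatically and semantically, and, when word co-occurrence information is wanted, whether it is better to apply the method to character chains or to word sequences, are topics for future work.
6. Conclusion
We have proposed a method that automatically finds and tabulates frequently used expressions and sets of expressions in enormous language data such as language corpora.
Specifically, we first improved the algorithm of Nagao et al., proposed as a way of computing arbitrary n-grams, from the viewpoint of extracting highly independent expressions, and proposed a method that automatically and exhaustively extracts and tabulates the strings occurring 2 or more times in the language data (uninterrupted collocations) under the condition that "a substring of a string that has once been extracted is not extracted thereafter". Next, we presented a method that combines the strings extracted in this way to extract sets of strings that co-occur at separated positions in a sentence (interrupted collocations) and obtain their frequencies. In an application to three months of newspaper article data (8.92 million characters), for uninterrupted collocation extraction the conventional method extracted 4.4 million types of strings of 2 or more characters and frequency 2 or more, with a total of 31.2 million occurrences, whereas the method of this paper reduced this to 970 thousand types and a total of 2.6 million occurrences. A comparison of the extracted strings confirmed that the strings obtained by the n-gram method include an enormous number of fragmentary strings (strings with no grammatical or semantic meaning), whereas the method of this paper removes these fragmentary strings to a large extent. This effect has made exhaustive automatic extraction of interrupted collocations possible. In the application of the proposed interrupted collocation extraction method, taking the strings that occurred 10 or more times in the uninterrupted tabulation (12,350 types), the pairs of any two of them that co-occurred within one sentence 2 or more times numbered 6,500 types (total of 21,800 occurrences), showing that interrupted collocations can be obtained easily. As described above, because the method of this paper suppresses the extraction of fragmentary strings in uninterrupted collocation extraction, interrupted collocations can be computed easily, and basic data on sentence structure, such as sentence patterns, can be collected almost automatically. This paper presented extraction results from Japanese character string data as an application example, but since the method can be applied to arbitrary symbol sequences, various applications are possible, such as word sequences segmented into word units, sequences of grammatical elements obtained by morphological analysis, or chains of semantic attributes in which each word is replaced by its semantic attribute☆. We intend to carry out various application experiments and make further improvements.
(Received March 31, 1995)
(Accepted September 6, 1995)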
Footnotes
☆ In this paper, the term "expression" is used with strings in mind that can be regarded, grammatically and semantically, as units of expression.
☆ A partial expression that is used only as part of some other expression is not extracted, but such an expression is considered to be no more than a part of the larger expression that contains it in the first place, so it is not picked out separately.
☆ For word sequences, there is a previous example 3) in which a method of calculating from the results was used. With that method, however, over-subtraction occurs when several strings that share substrings with one another have been extracted from a given region of the source text. Once extraction has finished, it is impossible to tell whether over-subtraction has occurred, so an exact calculation cannot be made.
☆ If a next-word address (next pointer) field is added to each record of the general sort file, the records can be followed by random access. However, the file is usually large and the number of disk accesses becomes enormous (in the experiment of Chapter 4, the time to access every record once at random is estimated at 10 million accesses × 10 ms = about 30 hours). In contrast, processing consecutive records is fast (although it depends on the size of the I/O buffer). Doing so requires re-sorting into source-address order, as described in the main text, but that sort can be executed in about the same time as the processing that finds the next-word addresses and writes them into the sort file. For these reasons, the next search is expected to be faster than random access. The same kind of re-sort is needed three times in total, twice for uninterrupted collocation string extraction and once for interrupted collocation, but in each case it merely re-sorts record numbers that have become unordered back into the original consecutive numbers, so it is simple and fast (in the example of Chapter 4, a few minutes per sort).
☆ In the part that uses the Nagao-Mori algorithm, this experiment used an ordinary comb sort, but it took a long time because the sort goes through indirect addresses. To speed it up, one possible approach is to load the first 1 or 2 characters into memory, sort on them directly, and then repeat partial sorts.
☆ In tabulating pairs of two types of expressions, about 180 thousand types of interrupted expressions (total frequency 400 thousand) were extracted when no cutoff was applied (frequency 2 or more targeted). Here, to make the results easier to examine, the case is shown in which the input was cut off at a standalone frequency of 10 so that the number of extracted types would be about 10 thousand or fewer.
☆☆ In Table 6, pairs such as "ゼネラル" + "モータース" (General + Motors) and "サミット" + "先進国首脳会議" (summit + meeting of the leaders of the advanced nations) appear frequently. This is because they appeared in the text as "ゼネラル・モータース" and "サミット (先進国首脳会議)", and symbols other than the comma were excluded from the tabulation of uninterrupted collocations.
☆☆☆ For example, if 10% of the strings extracted as uninterrupted collocations were valid expressions, the valid share of the sets of strings extracted as interrupted collocations would be expected to fall to 0.1 to the power n or less (where n is the number of expressions used as elements). How to narrow down the expressions to be extracted is therefore an even more important problem for interrupted collocations than for uninterrupted ones. According to the experiment, constraining the target expressions by character type is highly effective, but this point needs further study.
☆ In collocation extraction, the issue is how to pick up high-frequency expressions without omission. If the sample has no large bias in the distribution of the expressions that appear, then as the sample size increases, the number of occurrences of high-frequency expressions increases with it, so there is little risk of losing them even when a cutoff at a suitable value is applied. For a sample with a large bias in its expressions, it is more appropriate to divide it by genre and collect collocations for each.
☆ Applications such as the collation of works for copyright checking and the checking of DNA chains in genetic engineering are also expected. Packaging of the program is currently planned; those interested are asked to contact the authors. ({ikehara, shirai}@nttkb.ntt.jp)