To extract the frequently used and fixed expressions needed for natural language processing such as machine translation, this paper proposes algorithms that efficiently and automatically extract both uninterrupted and interrupted collocations from very large Japanese corpora. For uninterrupted collocations, the recently proposed n-gram statistics method can be used, and it yields substrings in order of string length and frequency of use. However, that method also extracts large volumes of fragmentary and unnecessary substrings, and narrowing them down has been a problem. For interrupted collocations, no suitable method was available. This paper first proposes an algorithm that extracts uninterrupted substrings of at least a given length and at least a given frequency while greatly restraining the extraction of fragmentary substrings. It then proposes a method that extracts interrupted collocations by combining the uninterrupted collocations thus obtained. The methods were applied to three months of newspaper articles containing 8.92 million characters. For uninterrupted collocations with a string length of 2 or more characters and a frequency of 2 or more, the conventional n-gram method extracted 4.4 million types (total frequency of 31.2 million), whereas the new method reduced this to 0.97 million types (total frequency of 2.6 million), a substantial reduction in fragmentary expressions. For interrupted collocations, combining the substrings that occurred 10 or more times in the first step (12,350 types) yielded 6.5 thousand types of substring pairs co-occurring 2 or more times within a sentence, with a total frequency of 21.8 thousand.
Recently, the importance of large corpora and usage examples has been pointed out in natural language processing, and the need for techniques to analyze them is growing. In machine translation, for example, proposed approaches include collecting phrases that cannot be translated well word by word and translating them as phrases, and turning expressions with a fixed structure into bilingual patterns so that a pattern dictionary maps the source language to the target language. To realize these approaches, frequently used phrases and expression patterns must be extracted from language data in actual use.
For huge amounts of language data, however, automatically finding and extracting all frequent strings of arbitrary length has been difficult in terms of computational cost. For this reason, methods were devised that extract only strings matching a particular purpose, either by focusing on properties of natural language or by focusing on properties of the strings to be extracted. Examples of the former, which aim to pick out strongly associated words, include a method based on the strength of association between two words 1), a method based on the distance between words 2), and methods that take into account the number of combined words and the number of occurrences 3),4). Examples of the latter include limiting the number of words or characters in a chain, or starting from short, frequent chains and counting longer chains only within that restricted set of strings (word sequences) 5).
More recently, a method that computes n-gram statistics for arbitrary n at high speed over large language data was proposed 6), making it possible to extract strings (in general, symbol sequences) of arbitrary length automatically and to count their occurrences. With this result, the strings used in the source text can be tabulated in order of length (number of characters) and in order of frequency. However, because this method ignores the mutual relations among the extracted strings, substrings of strings that have already been extracted are extracted again. Viewed as linguistic expressions, the extracted strings therefore consist mostly of fragments with no grammatical or semantic coherence. As a way of narrowing them down to meaningful strings, the same paper 6) points out the possibility of combining the extracted strings and their frequencies with one another. Later, a method using an entropy criterion on the extracted strings 7) was proposed for picking out meaningful expressions from the n-gram statistics. There is also an application of n-gram statistics to the extraction of particle-like fixed expressions 8); that method avoids the computational cost of n-grams by restricting in advance the character types that may make up an extracted string, and then narrows down the extracted strings with various heuristics.
Turning to pairs of expressions that co-occur at separate positions, several strings must be combined and their co-occurrence in the source text examined. Because n-gram statistics extract an enormous number of strings, searching the source text for every combination of extracted strings was physically difficult. There have been attempts 9),10) that, without distinguishing uninterrupted from interrupted types, extract as fixed-form sentences those sentences in which collocations occupy a large proportion, but no effective method is known for automatically finding and counting all frequent pairs of strings in large language data.
For large language data, collocation extraction faces two problems: first, whether the computation is feasible at all as the computational cost (file size) grows; and second, how to select the required expressions from the large volume of results. For interrupted collocations in particular, the cost grows geometrically with the number of constituent elements. When reducing the cost, narrowing that would miss required collocations is undesirable, given the purpose of collocation extraction.
For uninterrupted string extraction, the first problem has already been solved by n-gram statistics, but many fragmentary strings that cannot be regarded as units of expression (including fragments of words) are extracted. As a result, for interrupted collocations the cost grows until the computation becomes infeasible. If the extraction of fragmentary strings is restrained and the cost falls within a feasible range, the first problem is solved for interrupted collocations as well. As for the second problem, the final judgment must in the end be made by hand for each purpose, so it can be regarded as solved for the time being once the number of output string types falls within a range that manual work can handle (several thousand types, at most a few tens of thousands).
From this viewpoint, this paper proposes, as a way of restraining fragmentary strings in uninterrupted collocation extraction, a method that automatically and exhaustively extracts and counts collocations of at least a given length and at least a given frequency under a longest-match condition (once a string has been extracted, the substrings contained in it are not extracted). It then presents a method that uses the result to automatically extract and count interrupted collocations, in which several elements co-occur at separate positions. As an application example confirming the behavior of the proposed methods, results of extracting uninterrupted and interrupted collocations from Japanese newspaper articles are shown.
(1) Conditions for string extraction
Expressions☆ that co-occur in natural language sentences include those that form a continuous string, such as compound words and phrases (called uninterrupted collocations), and those in which two or more strings appear at separate positions in a sentence, such as kakari-musubi, concord relations, and pairs of a particular verb and a particular noun (called interrupted collocations). Since an interrupted collocation can be regarded as uninterrupted collocation strings co-occurring in a sentence, we first consider the uninterrupted strings.
To find continuous strings such as compound words and phrases exhaustively, and to keep to a minimum the extraction of fragmentary strings that do not form a grammatical or semantic unit of expression, strings are extracted under the following conditions.
| Condition 1: | Extract strings of at least a given length. | |
| Condition 2: | Extract strings of at least a given frequency. | |
| Condition 3: | Extract strings on the longest-match principle. |
Condition 3 means that once a string has been extracted from a certain place in the source text, the substrings contained in that string are not extraction targets. The same substring is, however, extracted when it appears at a different place. In Fig. 1, for example, if the 7-gram string α has been extracted, then in the subsequent extraction of strings of 6-grams and shorter, β and γ, which are substrings within α, are excluded. Occurrences of "DE" and "GHI" at positions other than where α was extracted are naturally extraction targets. The string δ is also an extraction target because it is not a substring of α.
(2) Significance of the longest-match principle in expression extraction
In general, linguistic expressions are built from larger and smaller expressions nested many layers deep. In collocation extraction it is desirable to discover and extract the repeatedly used units of expression automatically, without specifying the unit or length in advance. Extracting all strings exhaustively would include such expressions, but substrings would also be extracted again from within strings already extracted, so the result would contain many fragmentary strings.
If a collocation is regarded as an expression in which several words co-occur, the boundaries of a collocation string are also word boundaries. If strings are extracted in units that are as long as possible, their boundaries are likely to coincide with word boundaries, so they are more likely to be collocations than fragments. In other words, the extraction of fragmentary strings can be expected to be restrained. This is why Condition 3 was introduced in (1).
Consider the strings whose extraction is restrained by Condition 3. They are of two kinds: those that are never extracted because they are used only as part of a larger string, and those that are extracted several times as expressions independent of their surroundings but whose count is reduced where they occur as a substring of some other string. From the viewpoint of exhaustiveness, the omission of the former kind is the concern, and what matters is whether it includes strings that can be regarded as expressions.
However, even if an expression is embedded in a larger string, an expression that is highly independent and repeatedly used can be expected to occur repeatedly not only as a substring of another string but also as a string that is itself the longest unit☆. Thus, even under Condition 3, repeatedly used collocations (their types) can be expected to be extracted exhaustively.
(3) The method of Nagao and Mori and its problems
The method of Nagao and Mori 6) has already been proposed for efficiently extracting and counting n-grams for arbitrary n. It can be summarized as follows.
[Method of Nagao and Mori]
Let N be the total number of characters in the language data to be counted.
Step 1: "Creating the source address file"
Prepare a file of N records (the source address file) and store the values 0 to N - 1 (source addresses) in the records in order. A source address acts as a pointer to the substring of the language data that begins at the character with that number and ends at the end of the data (character N - 1); this substring is called a string word below.
Step 2: "Creating the general sorted file"
Sort the records of the source address file in character-code order of the corresponding string words to produce the general sorted file.
Step 3: "Counting the coincidence length"
Compare the string word of each record in the general sorted file with the string word of the record immediately after it, starting from the first character, and write the number of matching characters (the coincidence length) into the record.
Step 4: "Extracting and counting strings"
Examine the coincidence lengths in record order and compile the types of substrings and their numbers of occurrences.
With this method, strings occurring at least a given number of times can be obtained for each length (number of characters) and in descending order of frequency, so Conditions 1 and 2 are satisfied, but Condition 3 is not.
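The four steps above can be sketched in memory as follows. This is only a minimal illustration under stated assumptions, not the file-based implementation of reference 6): the function names `common_prefix_len`, `sorted_suffix_table`, and `ngram_table` are hypothetical, and Python lists stand in for the source address file and the general sorted file.

```python
def common_prefix_len(text, a, b):
    """Number of leading characters shared by the suffixes starting at a and b."""
    k = 0
    while a + k < len(text) and b + k < len(text) and text[a + k] == text[b + k]:
        k += 1
    return k


def sorted_suffix_table(text):
    """Steps 1-3: source addresses 0..N-1 sorted by the suffix they point to,
    plus the coincidence length of each record with the record after it."""
    order = sorted(range(len(text)), key=lambda i: text[i:])
    match = [common_prefix_len(text, order[r], order[r + 1])
             for r in range(len(order) - 1)]
    match.append(0)  # the last record has no successor
    return order, match


def ngram_table(text, n, min_freq=2):
    """Step 4: a run of adjacent records whose coincidence length is >= n
    all start with the same n-gram, so the run length is its frequency."""
    order, match = sorted_suffix_table(text)
    counts = {}
    run = 1
    for r, pos in enumerate(order):
        if match[r] >= n:
            run += 1
            continue
        # The run ends at this record; keep it if it is frequent enough.
        if run >= min_freq and pos + n <= len(text):
            counts[text[pos:pos + n]] = run
        run = 1
    return counts
```

For the text `abcXabcYabcZ`, `ngram_table(text, 2)` reports both `ab` and `bc` three times, in addition to `abc` for n = 3. Fragments of one repeated string are counted separately, which is the behavior Condition 3 is meant to suppress.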
To satisfy Condition 3, one could combine several n-gram frequency tables of different orders, for example by subtracting from the frequency of an extracted string the occurrences in which it was contained in a longer string. However, by the time the frequency tables have been generated, the information about how the extracted strings relate to one another in the source text has been lost, so this calculation is impossible☆.
We therefore go back to the general sorted file and consider a method in which a part of the language data that has once been extracted is not extracted or counted again as a different string. In the following, substrings are assumed to be extracted from the general sorted file in descending order of coincidence length.
Consider the extraction of an n-gram string and an m-gram string with n ＞ m. By assumption, the n-gram string is extracted before the m-gram string. The cases that matter are those in which the two strings share a common part in the source text. As shown in Fig. 2, they divide into the case in which the m-gram string is contained within the n-gram string and the case in which the m-gram string and the n-gram string share part of each other.
(1) Range of records that must be invalidated
When an n-gram has been extracted first, none of the m-grams in case 1 is an extraction target. When an n-gram string is extracted, m-grams in this relation must therefore be prevented from being extracted in later processing. We consider a method that, when an n-gram is extracted, looks up the m-grams contained in it on the general sorted file and attaches to the corresponding records a condition under which they are treated as invalid.
First, consider which records are candidates for invalidation. In case 1-1, it suffices to ensure that the record of the extracted n-gram itself does not become an extraction target again. In cases 1-2 and 1-3, the candidate records are those of the string words whose first character is one of the characters up to n characters ahead of the starting position of the n-gram in the source text.
Next, consider the condition for invalidation. Since the m-grams of case 2-2 must not be invalidated, the records to be invalidated among the candidates above are limited to those whose coincidence lengths are at most n - 1, n - 2, ..., 1, respectively. The m-grams of case 2-1 fall outside this invalidation processing and remain targets of extraction and counting.
Fig. 3 shows an example of the range subject to invalidation. It shows that when the 6-gram string "C～H" from the record at source address 3 is judged to be an extraction target, the parts up to H of the strings at source addresses 4 to 8 are invalidated.
(2) Searching for the records to be invalidated
The records of the general sorted file are in no particular order with respect to the source address value i of their string words. Finding the records whose source addresses are i + 1, i + 2, ... from a record with source address i would therefore require a sequential search, and the search time would be a serious problem. In the original source address file, by contrast, the records are arranged in order of the source address value i. That is, the records to be checked for invalidation follow in order immediately after the record that triggered the invalidation, so the search can be executed at high speed. The general sorted file is therefore sorted once more in order of source address, and invalidation is performed on the resulting file☆.
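(1) Algorithm
Based on the discussion in the previous chapter, we propose an algorithm that extracts from language data, as strings, the fixed (highly independent) expressions that occur 2 or more times, in descending order of number of characters and without duplication.
[String extraction algorithm]
Steps 1 to 3: Same as the method of Nagao and Mori.
Step 4: "Recording the extraction length"
For the string word of each record in the general sorted file, determine how many characters from its beginning are subject to extraction (the extraction length) and write it into the record. This yields the extended general sorted file. The extraction length is determined easily from the coincidence lengths of the preceding and following records.
Step 5: "Creating the extended source address file"
Re-sort the extended general sorted file in order of source address to obtain the extended source address file.
Step 6: "Valid/invalid judgment"
Examine the extraction lengths of the records of the extended source address file in order and judge whether each record is invalid. The result is written as the value of an accept/reject flag. The method of judgment is as described in Section 3.1.
Step 7: Sort the extended source address file obtained above back into the record order of the general sorted file; the result is the re-extended general sorted file.
Step 8: "Counting the extracted strings"
Examine the relation among the accept/reject flag, the extraction length, and the coincidence length in the re-extended general sorted file to decide which strings to extract, and at the same time obtain their numbers of occurrences.
Because the extraction length can be derived from the coincidence lengths of the preceding and following records (see Step 4), the counting can be done without referring to the stored extraction length.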
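A simplified in-memory reading of these steps is sketched below. It is not the authors' file-based procedure: the re-sorting steps and sentence boundaries are omitted, the names `extraction_lengths` and `longest_match_table` are hypothetical, and the rejection test (an occurrence is rejected when it lies entirely inside the span of an occurrence starting at an earlier source address) is an assumption distilled from the invalidation rule described earlier.

```python
from collections import Counter


def extraction_lengths(text):
    """For every source address, the length of the longest string starting
    there that also starts at some other address (0 if there is none)."""
    n = len(text)
    order = sorted(range(n), key=lambda i: text[i:])
    ext = [0] * n
    for r in range(n - 1):
        a, b = order[r], order[r + 1]
        k = 0
        while a + k < n and b + k < n and text[a + k] == text[b + k]:
            k += 1
        # The longest repeat is always shared with a neighbour in sorted order.
        ext[a] = max(ext[a], k)
        ext[b] = max(ext[b], k)
    return ext


def longest_match_table(text, min_len=2, min_freq=2):
    """Count each repeated string only where it is not contained in a
    longer repeated string that starts at an earlier address."""
    ext = extraction_lengths(text)
    counts = Counter()
    covered_end = 0  # end of the furthest-reaching span seen so far
    for start, length in enumerate(ext):
        end = start + length
        # Accept only spans that reach beyond every earlier span.
        if length >= min_len and end > covered_end:
            counts[text[start:end]] += 1
        covered_end = max(covered_end, end)
    return {s: c for s, c in counts.items() if c >= min_freq}
```

On `abcde1abcde2abc3abc4`, this sketch counts `abcde` twice and `abc` twice: the two occurrences of `abc` inside `abcde` are not counted again, while the two independent occurrences are.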
(2) Worked example
Fig. 4 shows an example of applying the above algorithm. In this example, n-gram statistics extract 24 types of strings with a total of 72 occurrences, whereas the method of this paper narrows them down to 5 types and 10 occurrences.
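We now consider how to obtain pairs of expressions in which two or more expressions co-occur at separate positions within one sentence (interrupted collocations), together with their numbers of occurrences. In the extraction of uninterrupted collocations (the method of Chapter 3), strings spanning several sentences were excluded, so the extracted uninterrupted collocations are closed within sentences. To extract interrupted collocations, it therefore suffices to scan the language data from the first sentence onward and count, for each pair of strings, the cases in which a pair of uninterrupted collocation strings appears in one sentence. The points to settle are the treatment of the sentence boundary character (the full stop) and the positional relation of the expressions to be extracted.
(1) Treatment of the full stop
Japanese sentences normally end with a full stop, so the span from one full stop to the next is taken as one sentence. For sentences that contain another sentence with its own full stop, such as quotations, the contained sentence (the span between paired quotation marks) is ignored for simplicity.
(2) Mutual relations of the extracted strings
In interrupted string collocation, pairs of strings that are adjacent to each other or that partially overlap within a sentence are not targets of extraction. We therefore consider the mutual relations of the strings extracted in Chapter 3.
Let α and β be uninterrupted strings extracted from the same sentence. Their positional relation in the source text is one of the three relations shown in Fig. 5. Case (c), in which α and β are separated, is naturally a target of interrupted collocation extraction, so cases (a) and (b) are considered here.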
(a) When strings α and β are adjacent
In the language data, there is at most one place that contains such a string. If there were two or more such places, the string αβ would be counted as a longer string γ, and the substrings α and β in those sentences would not be counted. Therefore, when α and β co-occur in a sentence 2 or more times, all of the sentences concerned except at most one are co-occurrences of type (c) (the separated type). In this case, a type (a) co-occurrence can be regarded as strings that normally co-occur in separated form and happen to be adjacent, so it is a target of interrupted collocation extraction.
(b) When strings α and β overlap
Let γ be the string that contains α and β. As in the previous item, if such a string γ occurs at 2 or more places in the language data, γ itself becomes a target of uninterrupted collocation extraction, and the strings α and β contained in it are not extracted. In the source text, therefore, strings α and β in relation (b) are extracted from at most one sentence, and the remaining sentences in which α and β co-occur are all co-occurrences of type (c). In this case, however, α and β of type (b) cannot be said to co-occur in the sentence, so they are not targets of extraction and counting.
From the above, in extracting interrupted collocations within a sentence, only co-occurrences of type (b) need to be excluded.
(3) Treatment of the order of appearance of expression elements
In an interrupted collocation, the order of appearance of its constituent elements (here, the substrings extracted as uninterrupted collocations) is meaningful, so pairs are extracted and counted with the order of appearance distinguished.
(1) Algorithm (see Fig. 6)
[Preparation]
Assign a string number to each string extracted as an uninterrupted collocation on the re-extended general sorted file.
Step 9: "Re-sorting the re-extended general sorted file"
Sort the re-extended general sorted file in order of source address, restoring the record order of the extended source address file.
Step 10: "Assigning sentence numbers"
Write a sentence number into each record of the resulting file.
Step 11: "Compressing the file"
Compress the above file by the following procedure to create the "interrupted collocation compressed file" (unneeded work areas are released in preparation for the next step).
| ① | Delete all fields other than the four fields sentence number, string number, extraction length, and source address. | |
| ② | Delete the records that have no value in the string number field. |
Step 12: "Extracting and counting interrupted collocations"
In general, to extract interrupted collocations consisting of k types of strings (k ≧ 2), write to a file every combination of k string numbers within the same sentence (as a set ordered by appearance in the sentence), sort the file, and count the number of identical sets.
This yields the frequency table of interrupted collocations. To output the sentences containing these expressions, it suffices to append the sentence number to each set of expressions created in Step 12.
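Step 12 for k = 2 can be sketched as follows. This is an illustrative sketch under assumptions, not the authors' implementation: the input is a hypothetical list of `(sentence_no, source_address, extraction_length, string)` tuples standing in for the compressed file, a `Counter` replaces the write, sort, and count pass over a file, and overlapping pairs (type (b) above) are skipped while adjacent and separated pairs are kept.

```python
from collections import Counter
from itertools import combinations


def interrupted_pairs(occurrences, min_freq=2):
    """Count ordered pairs of extracted strings that co-occur in a sentence.

    occurrences: iterable of (sentence_no, start, length, string) tuples,
    one per accepted uninterrupted collocation occurrence (assumed format).
    """
    by_sentence = {}
    for sentence_no, start, length, string in occurrences:
        by_sentence.setdefault(sentence_no, []).append((start, length, string))

    counts = Counter()
    for items in by_sentence.values():
        items.sort()  # order of appearance within the sentence
        for (s1, l1, first), (s2, _, second) in combinations(items, 2):
            # Keep the pair only if the second string starts at or after
            # the end of the first one (adjacent or separated, no overlap).
            if s1 + l1 <= s2:
                counts[(first, second)] += 1
    return {pair: c for pair, c in counts.items() if c >= min_freq}
```

Because the pairs are built from positions sorted within each sentence, `("ab", "cd")` and `("cd", "ab")` are counted as different collocations, which matches the requirement that the order of appearance be distinguished.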
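(2) Worked example
The above procedure was applied to the example of Fig. 4 to obtain interrupted collocations with two elements. The result is shown in Fig. 6. In this example, of the 25 pairs formed from the 5 types of uninterrupted collocation strings extracted in Chapter 3, 6 pairs co-occur apart from each other within a sentence 2 or more times, and their total number of occurrences is 12.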
Fig. 6 Example of interrupted collocational substring extraction (continues from Step 7 of Fig. 4).
5. Collocation extraction experiments
To verify the effect of the proposed methods, experiments on extracting uninterrupted and interrupted collocation strings were carried out on Japanese data, namely three months of Nikkei newspaper articles (8.92 million characters). Strings containing symbols other than the Japanese comma were excluded from extraction. The computer used was a XEROX ARGOSS 5270 (SUN OS4.1.3), and the maximum memory used was 48 MB. This chapter describes the characteristics of the strings obtained and the processing time.
5.1 Extraction of uninterrupted collocations
(1) Properties of the extracted strings
Tables 1 and 2 compare, against the conventional method, the number of types and the total frequency of the strings extracted when the number of characters or the frequency of the strings is restricted. Table 3 shows, by string length, the number of types, the frequency, and examples of the extracted strings. Table 4 shows examples of high-frequency strings.
Table 1 Number of extracted substrings and their total frequency (No. 1*).
* Tabulated by string length.
Table 2 Number of extracted substrings and their total frequency (No. 2*).
* Tabulated by frequency of occurrence.
Table 3 Examples of extracted substrings (in order of frequency; frequencies in parentheses).
Table 4 Examples of substrings with high frequency.
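These tables show the following. As expected, the method of this paper restrains the extraction of many fragmentary strings compared with the conventional method, and both the number of types and the frequency of the extracted strings fall substantially. For strings of 2 or more characters occurring 2 or more times, for example, the number of types is reduced to about one fifth and the total frequency to about one twelfth. The effect is larger for longer strings: for strings of 20 or more characters, both the number of types and the total frequency fall to about one hundredth.
(2) Processing time
The initial sort of the string words (creation of the general sorted file) took the most time (turnaround time: about 40 hours; CPU time: about 10 hours)☆, while the subsequent processing was very short by comparison (turnaround time: 34 minutes; CPU time: 16 minutes).
5.2 Extraction of interrupted collocation strings
(1) Properties of the extracted strings
For simplicity, Table 5 shows the number of extracted string pairs for the case in which two types of strings, each occurring 10 or more times on its own, co-occur apart from each other within one sentence☆. Tables 6 and 7 show, respectively, the string pairs with high frequency and, among the string pairs occurring 2 or more times, those with the largest total number of characters.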
Table 5 Characteristics of extracted pairs of substrings (number of types and total frequency).
(Co-occurrence of two types of strings within a sentence)
Table 6 Pairs of substrings with high frequency.
Table 7 Pairs of longest substrings (largest total number of characters).
Tables 6 and 7 show that most of the high-frequency interrupted collocations are co-occurrences of nouns. In particular, many co-occurrences involving proper nouns taken up as topics in the newspaper articles and quantities such as dates and times are extracted☆☆. Such noun co-occurrence information can be applied, for example, to building dictionaries for machine translation. In template translation and similar uses, on the other hand, one may want to collect co-occurrences of expression elements containing particles and auxiliary verbs, from which sentence patterns are easy to build, rather than co-occurrences of nouns. Table 5 shows that the extracted pairs are already considerably narrowed down (6,544 in total), so it is not very difficult to check all of them by hand and select the pairs suited to a purpose, such as pairs containing particles and auxiliary verbs.
With still larger language data, however, the output volume will grow and manual selection may become difficult. In that case, expressions unsuited to the purpose can be selectively deleted from the results, but it is also possible to intervene in the extraction of uninterrupted and interrupted collocations and restrict the strings to be extracted. If constraints are placed on the character-type composition of the target strings, or the extracted strings are checked and unneeded ones deleted, at as early a stage as possible, the subsequent computational cost decreases and so does the work of checking the output☆☆☆. As one example, Tables 8 and 9 show part of the interrupted collocations obtained under the condition that strings containing no hiragana and strings containing symbols or alphanumeric characters are not extracted. In this case, interrupted collocations corresponding to the sentence patterns of newspaper articles are extracted.
Table 8 Interrupted collocational expressions (in order of frequency; frequencies in parentheses).
(Result of extracting elements that contain hiragana and contain no symbols or alphanumeric characters)
Table 9 Interrupted collocational expressions (in order of total length).
(Result of extracting elements that contain hiragana and contain no symbols or alphanumeric characters)
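(2) Amount of language data and processing size
Because interrupted collocation handles pairs of the expressions extracted as uninterrupted collocations, the size of the file to which the pairs are written is expected to be the issue. The required size is determined by the number of expression pairs that occur as interrupted co-occurrences (the total frequency, counting pairs of frequency 1 and above). In the experiment, when the 0.97 million types of expressions extracted as uninterrupted collocations (total frequency 2.6 million) were used as they were without a cutoff (frequency 1 being excluded) to compute two-element interrupted collocations, 180 thousand types of pairs (total frequency 400 thousand) were obtained as interrupted expressions of frequency 2 or more. The number of pairs written to the file in Step 12 (as pairs of string numbers) was 20 million, which required 400 MB of file space (20 bytes per string pair). In contrast, when only the 12 thousand types of uninterrupted strings with frequency 10 or more (total frequency about 220 thousand) were taken as elements, 6,500 types of interrupted collocations of frequency 2 or more (total frequency about 20 thousand) were obtained. In this computation about 580 thousand pairs were written to the file and about 12 MB of file space was used, less than 1/30 of the amount needed without the cutoff.
We now consider the relation between the amount of language data and the processing size. The total frequency of the expressions extracted as uninterrupted collocations is roughly proportional to the amount of language data, and the total frequency of the expressions extracted as interrupted collocations is considered roughly proportional to the square of the total frequency of the uninterrupted ones. The file space used for counting interrupted collocations is therefore estimated to be roughly proportional to the square of the amount of language data. When the amount of language data increases, however, the cutoff value for uninterrupted collocations can be raised in proportion without lowering the extraction accuracy, and the important (high-frequency) expressions can still be expected to be collected without omission☆. Table 2 shows that the total frequency of the extracted uninterrupted collocations decreases roughly in inverse proportion to the cutoff value. From these points, when the amount of source data increases, raising the cutoff value in proportion allows interrupted collocations to be computed without lowering the extraction accuracy, and the growth in the file space needed can be expected to be held to an order proportional to the growth in language data.
5.3 Future improvements and applications
(1) Specifying the types of strings to extract according to the purpose
For interrupted collocation extraction, increasing the amount of language data that can be processed requires above all that the number of types of uninterrupted strings used be reduced as far as possible.
The strings extracted in the experiment, however, still contain a mixture of various kinds of strings. In the case of Japanese data, for example,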
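(2) Partially restoring the suppressed string counts
In this paper, extracting a substring from within a string that has already been extracted was regarded as double counting in the computation of uninterrupted collocations, and Condition 3 (extract only longest matches) was assumed. The extracted strings are therefore narrowed down to strings that are independent and can be regarded as uninterrupted collocations. To collect interrupted collocations made up of finer elements as well, elemental expressions can also be extracted from within strings that have already been extracted. To extract these elemental expressions while still restraining fragmentary strings, the substrings β and γ contained in string α in Fig. 1 should also be added to the count when the same string occurs elsewhere in the source text and is an extraction target there. Concretely, when an n-gram string is extracted in Step 8 of the algorithm of Chapter 3, the records that continue upward from that record and have an extraction length of n+1 or more are also added to the extraction targets for the n-gram. If the records that newly become extraction targets (the records subject to duplicate extraction) are copied and added at that point, the interrupted collocation processing needs no modification.
(3) Treatment of one-character elements in interrupted collocation extraction
In the experiment of this paper, the strings to be extracted were restricted to 2 or more characters to reduce the computational cost. In interrupted collocation extraction, however, when one wants to extract Japanese sentence patterns, there are cases in which sets of one-character keywords are to be found in a sentence, as in 「～が～を～に～」. In such cases, the method of this paper can be applied to the results of morphological analysis, as described below, but it is also conceivable to make the extraction feasible by narrowing down the extraction targets, for instance with the method described in (1), and so reducing the computational cost.
(4) Application to morpheme sequences, word sequences, and the like
To extract Japanese sentence patterns, the method of this paper is expected to be applied to sequences of symbols representing the grammatical and semantic attributes of the words obtained by morphological analysis of the language data. What kinds of sentence pattern information can be obtained in grammatical and semantic terms, and whether applying the method to character chains or to word sequences is better for obtaining word co-occurrence information, remain topics for future work.
6. Conclusion
We have proposed methods for automatically finding and counting frequently used expressions and sets of expressions in huge amounts of language data such as language corpora.
Concretely, we first improved the algorithm of Nagao et al., which was proposed as a way of computing arbitrary n-grams, from the viewpoint of extracting highly independent expressions, and proposed a method that automatically and exhaustively extracts and counts the strings occurring 2 or more times in the language data (uninterrupted collocations) under the condition that "the substrings of a string once extracted are not extraction targets thereafter". Next, we presented a method that combines the strings extracted in this way to extract sets of strings co-occurring at separate positions in a sentence (interrupted collocations) and obtain their frequencies.
In an application to three months of newspaper articles (8.92 million characters), for uninterrupted collocations of 2 or more characters and frequency 2 or more, the conventional method extracted 4.4 million types with a total of 31.2 million occurrences, whereas the method of this paper reduced this to 0.97 million types with a total of 2.6 million occurrences. A comparison of the extracted strings confirmed that the strings obtained by the n-gram method include an enormous number of fragmentary strings (strings with no grammatical or semantic meaning), whereas the method of this paper removes most of them. This effect made the exhaustive automatic extraction of interrupted collocations possible. In the application of the proposed interrupted collocation extraction, among the strings obtained in the uninterrupted collocation count, taking any two of the strings that occurred 10 or more times (12,350 types) yielded 6,500 types of pairs (total frequency 21,800) co-occurring 2 or more times within one sentence, showing that interrupted collocations can be obtained easily.
As described above, because the method of this paper restrains the extraction of fragmentary strings in uninterrupted collocation extraction, interrupted collocations can be computed easily, and basic data on sentence structure, such as sentence patterns, can be collected almost automatically. This paper showed results of extraction from Japanese character string data as an application example, but the method can be applied to arbitrary symbol sequences, so various applications are possible, such as word sequences segmented into words, sequences of grammatical elements obtained by morphological analysis, or chains of semantic attributes in which each word is replaced by its semantic attribute☆. We intend to carry out various application experiments and make further improvements.
References
(Received March 31, 1995)
(Accepted September 6, 1995)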
Footnotes
☆ In this paper, the term "expression" is used with strings in mind that can be regarded grammatically and semantically as units of expression.
☆ Partial expressions that are used only as part of some other expression are not extracted, but such expressions are considered to be no more than a part of the larger expression that contains them, so they are not extracted separately.
☆ For word sequences, there is a previous example 3) in which a method of calculating from the results was used. With that method, however, over-subtraction occurs when several strings that share substrings with one another have been extracted from one region of the source text. Once extraction has finished, it is impossible to judge whether over-subtraction has occurred, so an exact calculation cannot be made.
☆ If a next-word address (next pointer) field is provided in each record of the general sorted file, the records can be followed by random access. However, the file is normally large and the number of disk accesses becomes enormous (in the experiment of Chapter 4, the time to access every record once at random is estimated at 10 million accesses × 10 ms, or about 30 hours). In contrast, processing consecutive records is fast (although it depends on the size of the I/O buffer). To do so, the file must be re-sorted in source address order as described in the main text, but this sort can be executed in a time comparable to that of finding the next-word addresses and writing them into the sorted file. Sequential search is therefore expected to be faster than random access. The same kind of re-sort is needed three times in total, twice for uninterrupted collocation extraction and once for interrupted collocation, but each of them merely re-sorts shuffled record numbers back into their original consecutive order, which is simple and fast (in the example of Chapter 4, a few minutes per sort).
☆ In the part that uses the algorithm of Nagao and Mori, this experiment used an ordinary comb sort, which took time because it sorts through indirect addresses. To speed it up, one conceivable approach is to bring the first one or two characters into memory, sort on them directly, and then repeat partial sorts.
☆ In counting pairs of two types of expressions without a cutoff (frequency 2 or more), about 180 thousand types of interrupted expressions (total frequency 400 thousand) were extracted. To make the results easier to read, the case shown here applies an input cutoff at an individual frequency of 10, so that the number of extracted types is about 10 thousand or fewer.
☆☆ In Table 6, pairs such as 「ゼネラル」+「モータース」 and 「サミット」+「先進国首脳会議」 appear frequently. This is because they appeared in the text as 「ゼネラル・モータース」 and 「サミット (先進国首脳会議)」, and symbols other than the Japanese comma were excluded from the uninterrupted collocation count.
☆☆☆ For example, if 10% of the strings extracted as uninterrupted collocations were valid expressions, the proportion of valid sets among those extracted as interrupted collocations would fall to 0.1 to the power n or less (where n is the number of expressions taken as elements). How to narrow down the expressions to be extracted is therefore an even more important problem for interrupted collocations than for uninterrupted ones. According to the experiment, constraining the expressions by character type is highly effective, but this point requires further study.
☆ In collocation extraction, the issue is how to pick up high-frequency expressions without omission. If the sample has no large bias in the distribution of expressions, the frequency of the high-frequency expressions increases as the sample grows, so there is little risk of missing them even when a cutoff at a suitable value is applied. For samples with a large bias in expressions, it is more appropriate to divide them by genre and collect collocations for each.
☆ Applications such as collating works for copyright checking and checking DNA chains in genetic engineering are also expected. The program is currently planned to be packaged, so those interested are asked to contact the authors. ({ikehara, shirai}@nttkb.ntt.jp)