• DocumentCode
    2501463
  • Title

    BEST 2009 : Thai word segmentation software contest

  • Author

    Kosawat, Krit ; Boriboon, Monthika ; Chootrakool, Patcharika ; Chotimongkol, Ananlada ; Klaithin, Supon ; Kongyoung, Sarawoot ; Kriengket, Kanyanut ; Phaholphinyo, Sitthaa ; Purodakananda, Sumonmas ; Thanakulwarapas, Tipraporn ; Wutiwiwatchai, Chai

  • Author_Institution
    Human Language Technol. Lab. (HLT), Nat. Sci. & Technol. Dev. Agency (NSTDA), Pathumthani, Thailand
  • fYear
    2009
  • fDate
    20-22 Oct. 2009
  • Firstpage
    83
  • Lastpage
    88
  • Abstract
    This is a non-technical paper describing how and why we organized BEST 2009, the first contest in the series of “benchmark for enhancing the standard of Thai language processing”, which is expected to help accelerate the progress of the natural language processing technology in Thailand by assembling 3 essential components: common standards, resources and researchers. The BEST 2009 : Thai word segmentation software contest is the first shared task on Thai NLP that exercised this assemblage and aimed to find the best algorithms that could correctly divide Thai non-segmented script into words according to the guidelines previously prepared by experts from several research institutes and universities. Thai word-segmented corpora of 5 million words have been developed as a training set, another 600 K as a test set. The evaluation procedure and protocol have been designed. The process and the results of the contest are reported.
  • Keywords
    natural language processing; text analysis; BEST 2009; NLP; Thai word segmentation software contest; Acceleration; Assembly; Educational institutions; Guidelines; Paper technology; Protocols; Software algorithms; Software standards; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing, 2009. SNLP '09. Eighth International Symposium on
  • Conference_Location
    Bangkok
  • Print_ISBN
    978-1-4244-4138-9
  • Electronic_ISBN
    978-1-4244-4139-6
  • Type

    conf

  • DOI
    10.1109/SNLP.2009.5340941
  • Filename
    5340941