DocumentCode
2501463
Title
BEST 2009 : Thai word segmentation software contest
Author
Kosawat, Krit ; Boriboon, Monthika ; Chootrakool, Patcharika ; Chotimongkol, Ananlada ; Klaithin, Supon ; Kongyoung, Sarawoot ; Kriengket, Kanyanut ; Phaholphinyo, Sitthaa ; Purodakananda, Sumonmas ; Thanakulwarapas, Tipraporn ; Wutiwiwatchai, Chai
Author_Institution
Human Language Technol. Lab. (HLT), Nat. Sci. & Technol. Dev. Agency (NSTDA), Pathumthani, Thailand
fYear
2009
fDate
20-22 Oct. 2009
Firstpage
83
Lastpage
88
Abstract
This is a non-technical paper describing how and why we organized BEST 2009, the first contest in the series of "benchmark for enhancing the standard of Thai language processing", which is expected to help accelerate the progress of the natural language processing technology in Thailand by assembling 3 essential components: common standards, resources and researchers. The BEST 2009 : Thai word segmentation software contest is the first shared task on Thai NLP that exercised this assemblage and aimed to find the best algorithms that could correctly divide Thai non-segmented script into words according to the guidelines previously prepared by experts from several research institutes and universities. Thai word-segmented corpora of 5 million words have been developed as a training set, another 600 K as a test set. The evaluation procedure and protocol have been designed. The process and the results of the contest are reported.
Keywords
natural language processing; text analysis; BEST 2009; NLP; Thai word segmentation software contest; Acceleration; Assembly; Educational institutions; Guidelines; Natural language processing; Paper technology; Protocols; Software algorithms; Software standards; Testing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Language Processing, 2009. SNLP '09. Eighth International Symposium on
Conference_Location
Bangkok
Print_ISBN
978-1-4244-4138-9
Electronic_ISBN
978-1-4244-4139-6
Type
conf
DOI
10.1109/SNLP.2009.5340941
Filename
5340941