DocumentCode :
2337360
Title :
A comparison of collation algorithm for Myanmar language
Author :
Yuzana ; Tun, Khin Marlar
Author_Institution :
Univ. of Comput. Studies, Yangon
fYear :
2008
fDate :
13-16 Nov. 2008
Firstpage :
538
Lastpage :
543
Abstract :
Myanmar text has no white space and no explicit word boundaries, and Unicode database applications offer little support for operations such as collation and searching. The need for a robust collation strategy has motivated extensive research in natural language processing. We therefore propose a new collation algorithm for the Myanmar language, MyCollate2, which extends MyCollate1. The algorithm is based on a heuristics chart (table). It first slices names into syllables and then collates them according to the order of the traditional standard Myanmar dictionary. The proposed heuristics chart works well for both syllable segmentation and word collation. The algorithm can collate Myanmar names as well as Myanmar words with complex syllable structure, such as Pali words, Pali loan styles, subscript styles and kinzi styles. It was tested on Myanmar names, Pali words from Damma books and words from a dictionary. The experimental results show a syllable slicing accuracy of 99.55% in comparison with other methods, and report slicing performance. Collation accuracy reaches 95.88%, which is higher than that of the previous collation algorithm, MyCollate1.
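The two-stage idea in the abstract (slice into syllables, then collate syllable by syllable) can be illustrated with a minimal Python sketch. This is not MyCollate2: the paper's heuristics chart is not reproduced in this record, so the break rule and the ordering below are assumptions. The names `split_syllables` and `collation_key` are hypothetical, the break rule is a common simplification for Unicode-encoded Myanmar text, and plain code point order stands in for the dictionary order.

```python
import re

# Assumption: input is Unicode-encoded Myanmar text.
CONSONANT = "\u1000-\u1021"  # basic consonant letters
ASAT = "\u103A"              # marks a syllable-final consonant
VIRAMA = "\u1039"            # stacks the next consonant as a subscript

# Simplified rule: a consonant opens a new syllable unless it is stacked
# (preceded by VIRAMA) or closes the current syllable (followed by ASAT
# or VIRAMA). Kinzi and subscript clusters therefore stay in one chunk.
_BREAK = re.compile(f"(?<!{VIRAMA})(?=[{CONSONANT}](?![{ASAT}{VIRAMA}]))")

def split_syllables(word: str) -> list[str]:
    """Slice a word into syllable-like chunks using the rule above."""
    return [chunk for chunk in _BREAK.split(word) if chunk]

def collation_key(word: str) -> tuple:
    """Build a sort key that compares words one syllable at a time.

    Placeholder ordering: each syllable is ranked by its code points.
    A real implementation would map syllable components to weights
    taken from a dictionary-order table instead.
    """
    return tuple(tuple(ord(ch) for ch in syl) for syl in split_syllables(word))

def collate(words: list[str]) -> list[str]:
    """Sort words by their syllable-wise key."""
    return sorted(words, key=collation_key)
```

Calling `collate(names)` on a list of Unicode Myanmar strings returns them sorted by the placeholder key. Swapping the body of `collation_key` for a weight lookup is where a dictionary-order table would plug in.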
Keywords :
dictionaries; natural language processing; Damma books; MyCollate1; MyCollate2; Myanmar language dictionary; Myanmar name; Pali loan styles; Pali words; collation algorithm; collation strategy; dictionary words; heuristics chart; heuristics table; kinzi styles; natural language processing; subscript styles; syllable segmentation; syllable slicing; unicode database application; Books; Clustering algorithms; Databases; Dictionaries; Information retrieval; Libraries; Natural language processing; Natural languages; Sorting; White spaces;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Digital Information Management, 2008. ICDIM 2008. Third International Conference on
Conference_Location :
London
Print_ISBN :
978-1-4244-2916-5
Electronic_ISBN :
978-1-4244-2917-2
Type :
conf
DOI :
10.1109/ICDIM.2008.4746740
Filename :
4746740