DocumentCode
2647237
Title
Proposed Myanmar Word Tokenizer based on LIPIDIPIKAR treatise
Author
Thwin, Thein Than ; Win, Aye Thida ; Wai, Phyo Phyo ; Thwin, Mie Mie Su
Author_Institution
Univ. of Comput. Studies, Mandalay, Myanmar
Volume
7
fYear
2010
fDate
16-18 April 2010
Abstract
Technologies based on Natural Language Processing (NLP) are becoming increasingly important, and future intelligent systems will rely on them more heavily as the field advances rapidly. Asia, however, is a difficult region for NLP because of its linguistic diversity, and many Asian languages are inadequately supported on computers. Myanmar is an analytic language, but it includes special characters such as the killer and medials. In English and other European languages, syllables are formed by combining letters that represent only consonants and vowels, whereas Myanmar uses compound syllables that are more difficult to analyze. This makes word sorting difficult. In the proposed system, the condensed form of ordinary Myanmar script is transformed into an analyzable elaborated form based on the LIPIDIPIKAR treatise written by Yaw Min Gyi U Pho Hlaing. Words in elaborated form can be sorted easily using this treatise. The complexity of sorting condensed Myanmar words is compared with the complexity of sorting elaborated words.
Keywords
natural language processing; Asian languages; English; European languages; LIPIDIPIKAR treatise; Myanmar ordinary scripts; Myanmar word tokenizer; intelligent systems; linguistic diversity; Asia; Databases; Diversity reception; Natural languages; Sorting; Speech synthesis; Transducers; Writing; Condensed form; Elaborated form; NLP; Phonetic token; Unicode
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Engineering and Technology (ICCET), 2010 2nd International Conference on
Conference_Location
Chengdu
Print_ISBN
978-1-4244-6347-3
Type
conf
DOI
10.1109/ICCET.2010.5485313
Filename
5485313