• DocumentCode
    2647237
  • Title

    Proposed Myanmar Word Tokenizer based on LIPIDIPIKAR treatise

  • Author

    Thwin, Thein Than ; Win, Aye Thida ; Wai, Phyo Phyo ; Thwin, Mie Mie Su

  • Author_Institution
    Univ. of Comput. Studies, Mandalay, Myanmar
  • Volume
    7
  • fYear
    2010
  • fDate
    16-18 April 2010
  • Abstract
    Technologies based on Natural Language Processing (NLP) are becoming increasingly important, and future intelligent systems will rely on them more heavily as the technology improves rapidly. Asia, however, has become a dense area in the NLP field because of its linguistic diversity, and many Asian languages are inadequately supported on computers. Myanmar is an analytic language, but it includes special characters such as the killer and medials. In English and other European languages, every syllable is formed by combining letters that represent only consonants and vowels, whereas Myanmar uses compound syllables that are more difficult to analyze. This causes difficulties in word sorting. In our proposed system, the condensed form of ordinary Myanmar script is transformed into an analyzable elaborated script based on the LIPIDIPIKAR treatise written by Yaw Min Gyi U Pho Hlaing. The resulting elaborated words can be sorted easily using this treatise. We also compare the complexity of sorting Myanmar condensed words with the complexity of sorting elaborated words.
  • Keywords
    natural language processing; Asian languages; English; European languages; LIPIDIPIKAR treatise; Myanmar ordinary scripts; Myanmar word tokenizer; intelligent systems; linguistic diversity; Asia; Databases; Diversity reception; Natural languages; Sorting; Speech synthesis; Transducers; Writing; Condensed form; Elaborated form; NLP; Phonetic token; Unicode
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Engineering and Technology (ICCET), 2010 2nd International Conference on
  • Conference_Location
    Chengdu
  • Print_ISBN
    978-1-4244-6347-3
  • Type

    conf

  • DOI
    10.1109/ICCET.2010.5485313
  • Filename
    5485313