• DocumentCode
    3142314
  • Title

    Phoneme strings based machine transliteration

  • Author

    Qin, Ying

  • Author_Institution
    Dept. of Comput. Sci., Beijing Foreign Studies Univ., Beijing, China
  • fYear
    2011
  • fDate
    27-29 Nov. 2011
  • Firstpage
    304
  • Lastpage
    309
  • Abstract
    Transliteration translates source-language names into a target language with approximately equivalent pronunciation. The current direct orthographical mapping (DOM) approach performs segmentation and alignment on the basis of single syllables. However, it is hard to break English names down into constituent parts that correspond to single Chinese characters. This paper proposes segmentation and alignment at the level of phoneme strings for transliteration between English and Chinese. To reduce the cost of training a model on the whole corpus, we split the training data randomly into several pools and train one model on each. The final transliteration results are ranked by the decoding probability given by each model; we call this the combined model. The combined English-Chinese machine transliteration system performs remarkably well on the NEWS2011 shared task.
  • Keywords
    language translation; natural language processing; Chinese characters; English names; NEWS2011; approximate pronunciation equivalence; direct orthographical mapping approach; model training; phoneme strings based machine transliteration; source names; Labeling; Conditional Random Fields; combined transliteration system; machine transliteration; phoneme strings;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering (NLP-KE), 2011 7th International Conference on
  • Conference_Location
    Tokushima
  • Print_ISBN
    978-1-61284-729-0
  • Type

    conf

  • DOI
    10.1109/NLPKE.2011.6138214
  • Filename
    6138214