• DocumentCode
    2424746
  • Title

    Chinese-English transliteration using weighted finite-state transducers

  • Author

    Wei, Pang ; Bo, Xu

  • Author_Institution
    Inst. of Autom., Digital Media Content Technol. Res. Center, Chinese Acad. of Sci., Beijing
  • fYear
    2008
  • fDate
    7-9 July 2008
  • Firstpage
    1328
  • Lastpage
    1333
  • Abstract
    This paper proposes a novel method for Chinese-English transliteration that combines multiple models using weighted finite-state transducers (WFSTs). WFSTs provide a unified framework for integrating the components of a speech-to-speech translation system, such as speech recognition and machine translation; their appeal lies in their flexibility in integrating multiple sources of information, among other useful properties. We build several models, including a grapheme-based model, a phoneme-based model, and an extended phoneme-based model. Combining these models within the unified WFST framework yields a combined Chinese-English transliteration model. The advantage of this method is that combining information sources from different models makes fuller use of the available data. Our experiments show that the resulting system outperforms single-model systems whose model is trained directly on Chinese-English name pairs for Chinese-English translation.
  • Keywords
    finite state machines; language translation; natural language processing; Chinese-English transliteration; grapheme-based model; phoneme-based model; speech-to-speech translation system; weighted finite-state transducer; Data mining; Degradation; Dictionaries; Error analysis; Information resources; Information retrieval; Natural languages; Speech recognition; Training data; Transducers;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Audio, Language and Image Processing, 2008. ICALIP 2008. International Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4244-1723-0
  • Electronic_ISBN
    978-1-4244-1724-7
  • Type

    conf

  • DOI
    10.1109/ICALIP.2008.4590109
  • Filename
    4590109