• DocumentCode
    3423552
  • Title
    Robust phone set mapping using decision tree clustering for cross-lingual phone recognition
  • Author
    Sim, Khe Chai ; Li, Haizhou
  • Author_Institution
    Inst. for Infocomm Res., Singapore
  • fYear
    2008
  • fDate
    March 31 2008-April 4 2008
  • Firstpage
    4309
  • Lastpage
    4312
  • Abstract
    Recently, research related to multi-lingual and cross-lingual speech has gained increasing popularity. One of the major problems when dealing with multi-lingual speech data is the mapping of the phone sets between different languages. Phone mapping is useful for cross-lingual speech recognition, cross-lingual pronunciation modelling and mixed-language speech synthesis, to name a few. In this paper, an automatic context-sensitive phone set mapping method is presented to improve the mapping accuracy. A training methodology that allows the mapping to be learned automatically from parallel time-aligned phone transcriptions is also described. In particular, a decision tree clustering technique is used to tie unseen contexts for robustness. The quality of the proposed mapping method is evaluated on a cross-lingual phone recognition task where the Hungarian and Russian phone recognisers are used to recognise Czech speech and produce Czech phone sequences through phone set mapping. The mapping was trained on only a small amount of data. A consistent relative improvement of 5-7% is reported when contextual information is added to phone set mapping.
  • Keywords
    decision trees; natural language processing; speech recognition; speech synthesis; Czech phone sequences; Hungarian phone recognition; Russian phone recognition; automatic context sensitive phone set mapping method; cross-lingual phone recognition; cross-lingual pronunciation modelling; decision tree clustering; mixed language speech synthesis; multi-lingual speech; parallel time-aligned phone transcriptions; robust phone set mapping; Automatic speech recognition; Context modeling; Decision trees; Natural languages; Robust control; Robustness; Speech analysis; Speech recognition; Speech synthesis; Target recognition; context sensitive mapping; cross-lingual; decision tree clustering; phone recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
  • Conference_Location
    Las Vegas, NV
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-1483-3
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2008.4518608
  • Filename
    4518608