DocumentCode
672363
Title
Models of tone for tonal and non-tonal languages
Author
Metze, Florian ; Sheikh, Zaid A. W. ; Waibel, Alex ; Gehring, Jonas ; Kilgour, Kevin ; Quoc Bao Nguyen ; Van Huy Nguyen
Author_Institution
Language Technol. Inst./InterACT, Carnegie Mellon Univ., Pittsburgh, PA, USA
fYear
2013
fDate
8-12 Dec. 2013
Firstpage
261
Lastpage
266
Abstract
Conventional wisdom in automatic speech recognition asserts that pitch information is not helpful in building speech recognizers for non-tonal languages and contributes only modestly to performance in speech recognizers for tonal languages. To maintain consistency between different systems, pitch is therefore often ignored, trading the slight performance benefits for greater system uniformity/simplicity. In this paper, we report results that challenge this conventional approach. We present new models of tone that deliver consistent performance improvements for tonal languages (Cantonese, Vietnamese) and even modest improvements for non-tonal languages. Using neural networks for feature integration and fusion, these models achieve significant gains throughout, and provide us with system uniformity and standardization across all languages, tonal and non-tonal.
Keywords
natural language processing; neural nets; speech recognition; automatic speech recognition; feature fusion; feature integration; neural network; nontonal language; pitch information; speech recognizer; tonal language; Context; Feature extraction; Mel frequency cepstral coefficient; Speech; Speech recognition; Training; Acoustic Modeling; Automatic Speech Recognition; Neural Networks; Tonal Features; Tone Modeling
fLanguage
English
Publisher
IEEE
Conference_Titel
Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on
Conference_Location
Olomouc
Type
conf
DOI
10.1109/ASRU.2013.6707740
Filename
6707740