DocumentCode :
311038
Title :
Recognition of conversational telephone speech using the JANUS speech engine
Author :
Zeppenfeld, Torsten ; Finke, Michael ; Ries, Klaus ; Westphal, Martin ; Waibel, Alex
Author_Institution :
Interactive Syst. Labs., Carnegie Mellon Univ., Pittsburgh, PA, USA
Volume :
3
fYear :
1997
fDate :
21-24 Apr 1997
Firstpage :
1815
Abstract :
Recognition of conversational speech is one of the most challenging speech recognition tasks to date. While recognition error rates of 10% or lower can now be reached on speech dictation tasks over vocabularies in excess of 60,000 words, recognition of conversational speech has persistently resisted most attempts at improvements by way of the proven techniques to date. Difficulties arise from shorter words, telephone channel degradation, and highly disfluent and coarticulated speech. In this paper, we describe the application, adaptation, and performance evaluation of our JANUS speech recognition engine to the Switchboard conversational speech recognition task. Through a number of algorithmic improvements, we have been able to reduce error rates from more than 50% word error to 38%, measured on the official 1996 NIST evaluation test set. Improvements include vocal tract length normalization, polyphonic modeling, label boosting, speaker adaptation with and without confidence measures, and speaking-mode-dependent pronunciation modeling.
Keywords :
natural languages; speech recognition; telephony; JANUS speech engine; Switchboard conversational speech recognition task; algorithmic improvements; coarticulated speech; confidence measures; conversational speech; conversational telephone speech; disfluent speech; error rates; label boosting; polyphonic modeling; speaker adaptation; speaking mode dependent pronunciation modeling; speech recognition; telephone channel degradation; vocal tract length normalization; word error; word length; Boosting; Degradation; Engines; Error analysis; Length measurement; NIST; Speech recognition; Telephony; Testing; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
Conference_Location :
Munich
ISSN :
1520-6149
Print_ISBN :
0-8186-7919-0
Type :
conf
DOI :
10.1109/ICASSP.1997.598889
Filename :
598889