Title :
New features in the CU-HTK system for transcription of conversational telephone speech
Author :
Hain, T. ; Woodland, P.C. ; Evermann, G. ; Povey, D.
Author_Institution :
Dept. of Eng., Cambridge Univ., UK
Abstract :
This paper discusses new features integrated into the Cambridge University HTK (CU-HTK) system for the transcription of conversational telephone speech. Major improvements have been achieved through the use of maximum mutual information estimation in training, in addition to maximum likelihood estimation; the use of a full variance transform for adaptation; the inclusion of unigram pronunciation probabilities; and word-level posterior probability estimation using confusion networks, applied to minimum word error rate decoding, confidence score estimation and system combination. The improvements are demonstrated via performance on the NIST March 2000 evaluation of English conversational telephone speech transcription (Hub5E). In this evaluation the CU-HTK system gave an overall word error rate of 25.4%, which was the best performance by a statistically significant margin.
Keywords :
hidden Markov models; maximum likelihood estimation; probability; speech recognition; telephony; CU-HTK system; Cambridge University HTK system; Hub5E evaluation; NIST March 2000 evaluation; confidence score estimation; confusion networks; conversational telephone speech; full variance transform; maximum mutual information estimation; minimum word error rate decoding; transcription; unigram pronunciation probabilities; word-level posterior probability estimation; cepstral analysis; error analysis; maximum likelihood decoding; mutual information; speech; training data; vocabulary
Conference_Title :
2001 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '01), Proceedings
Conference_Location :
Salt Lake City, UT
Print_ISBN :
0-7803-7041-4
DOI :
10.1109/ICASSP.2001.940766