Title :
Recent improvements to IBM's speech recognition system for automatic transcription of broadcast news
Author :
Chen, S.S. ; Eide, E.M. ; Gales, M.J.F. ; Gopinath, R.A. ; Kanevsky, D. ; Olsen, P.
Author_Institution :
IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
We describe extensions and improvements to IBM's system for automatic transcription of broadcast news. The speech recognizer uses a total of 160 hours of acoustic training data, 80 hours more than for the system described in Chen et al. (1998). In addition to improvements obtained in 1997, we made a number of changes and algorithmic enhancements. Among these were changing the acoustic vocabulary, reducing the number of phonemes, insertion of short pauses, mixture models consisting of non-Gaussian components, pronunciation networks, factor analysis (FACILT) and Bayesian information criteria (BIC) applied to choosing the number of components in a Gaussian mixture model. The models were combined in a single system using NIST's script voting machine known as rover (Fiscus 1997).
Keywords :
Bayes methods; Gaussian processes; speech recognition; BIC; Bayesian information criteria; FACILT; Gaussian mixture model; IBMs speech recognition system; NIST script voting machine; acoustic training; acoustic vocabulary; algorithmic enhancements; automatic transcription; broadcast news; factor analysis; mixture models; nonGaussian components; phonemes; pronunciation networks; rover; short pauses; speech recognizer; Bayesian methods; Broadcasting; Hidden Markov models; Information analysis; Speech analysis; Speech enhancement; Speech recognition; Telephony; Vocabulary; Voting;
Conference_Title :
Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on
Conference_Location :
Phoenix, AZ
Print_ISBN :
0-7803-5041-3
DOI :
10.1109/ICASSP.1999.758056