Title :
Recent advances in broadcast news transcription
Author :
Kim, D.Y. ; Evermann, G. ; Hain, T. ; Mrva, D. ; Tranter, S.E. ; Wang, L. ; Woodland, P.C.
Author_Institution :
Dept. of Eng., Cambridge Univ., UK
fDate :
Nov. 30 2003-Dec. 4 2003
Abstract :
This paper describes recent advances in the CU-HTK Broadcast News English (BN-E) transcription system and its performance in the DARPA/NIST Rich Transcription 2003 Speech-to-Text (RT-03) evaluation. Heteroscedastic linear discriminant analysis (HLDA) and discriminative training, which were previously developed in the context of the recognition of conversational telephone speech, have been successfully applied to the BN-E task for the first time. A number of new features have also been added. These include gender-dependent (GD) discriminative training and modified discriminative training using lattice regeneration and combination. On the 2003 evaluation set, the system gave an overall word error rate of 10.7% in less than 10 times real time (10×RT).
Keywords :
lattice theory; learning (artificial intelligence); natural languages; speech recognition; DARPA/NIST Rich Transcription 2003; broadcast news transcription; conversational telephone speech recognition; gender-dependent discriminative training; heteroscedastic linear discriminant analysis; lattice combination; lattice generation; lattice regeneration; speech-to-text; word error rate; Broadcasting; Cepstral analysis; Hidden Markov models; Lattices; Maximum likelihood estimation; NIST; Real time systems; Speech recognition; Telephony; Wideband;
Conference_Titel :
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Conference_Location :
St Thomas, VI, USA
Print_ISBN :
0-7803-7980-2
DOI :
10.1109/ASRU.2003.1318412