Transcribing broadcast news with the 1997 Abbot System

Author

Cook, Gary; Robinson, Tony

Author_Institution

Dept. of Eng., Cambridge Univ., UK

Volume

2

fYear

1998

fDate

12-15 May 1998

Firstpage

917

Abstract

Previous DARPA CSR evaluations have focused on the transcription of broadcast news from both television and radio programmes. This is a challenging task because the data includes a variety of speaking styles and channel conditions. This paper describes the development of a connectionist-hidden Markov model (HMM) system, and the enhancements designed to improve performance on broadcast news data. Both multilayer perceptron (MLP) and recurrent neural network acoustic models have been investigated. We assess the effect of using gender-dependent acoustic models, and the impact on performance of varying both the number of parameters and the amount of training data used for acoustic modelling. The use of context-dependent phone models is described, and the effect of the number of context classes is investigated. We also describe a method for incorporating syllable boundary information during search. Results are reported on the 1997 DARPA Hub-4 development test set.

Keywords

acoustic signal processing; broadcasting; hidden Markov models; learning (artificial intelligence); multilayer perceptrons; recurrent neural nets; speech recognition; telecommunication computing; 1997 Abbot System; DARPA CSR evaluations; DARPA Hub-4 development test set; HMM system; MLP; acoustic models; broadcast news data; broadcast news transcription; channel conditions; connectionist-hidden Markov model; context-dependent phone models; gender-dependent acoustic models; multilayer perceptron; radio programmes; recurrent neural network; speaking styles; speech recognition system; syllable boundary information; television programmes; training data; Acoustic testing; Context modeling; Decoding; Hidden Markov models; Radio broadcasting; Recurrent neural networks; Speech enhancement; TV broadcasting; Telephony; Training data;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on

Conference_Location

Seattle, WA

ISSN

1520-6149

Print_ISBN

0-7803-4428-6

Type

conf

DOI

10.1109/ICASSP.1998.675415

Filename

675415