• DocumentCode
    323801
  • Title

    Transcribing broadcast news with the 1997 Abbot System

  • Author

    Cook, Gary ; Robinson, Tony

  • Author_Institution
    Dept. of Eng., Cambridge Univ., UK
  • Volume
    2
  • fYear
    1998
  • fDate
    12-15 May 1998
  • Firstpage
    917
  • Abstract
    Previous DARPA CSR evaluations have focused on the transcription of broadcast news from both television and radio programmes. This is a challenging task because the data includes a variety of speaking styles and channel conditions. This paper describes the development of a connectionist-hidden Markov model (HMM) system, and the enhancements designed to improve its performance on broadcast news data. Both multilayer perceptron (MLP) and recurrent neural network acoustic models have been investigated. We assess the effect of using gender-dependent acoustic models, and the impact on performance of varying both the number of parameters and the amount of training data used for acoustic modelling. The use of context-dependent phone models is described, and the effect of the number of context classes is investigated. We also describe a method for incorporating syllable boundary information during search. Results are reported on the 1997 DARPA Hub-4 development test set.
  • Keywords
    acoustic signal processing; broadcasting; hidden Markov models; learning (artificial intelligence); multilayer perceptrons; recurrent neural nets; speech recognition; telecommunication computing; 1997 Abbot System; DARPA CSR evaluations; DARPA Hub-4 development test set; HMM system; MLP; acoustic models; broadcast news data; broadcast news transcription; channel conditions; connectionist-hidden Markov model; context-dependent phone models; gender-dependent acoustic models; multilayer perceptron; radio programmes; recurrent neural network; speaking styles; speech recognition system; syllable boundary information; television programmes; training data; Acoustic testing; Context modeling; Decoding; Hidden Markov models; Radio broadcasting; Recurrent neural networks; Speech enhancement; TV broadcasting; Telephony; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
  • Conference_Location
    Seattle, WA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-4428-6
  • Type

    conf

  • DOI
    10.1109/ICASSP.1998.675415
  • Filename
    675415