• DocumentCode
    310507
  • Title
    Transcribing broadcast news shows
  • Author
    Gauvain, J.-L. ; Adda, G. ; Lamel, L. ; Adda-Decker, M.
  • Author_Institution
    LIMSI, CNRS, Orsay, France
  • Volume
    2
  • fYear
    1997
  • fDate
    21-24 Apr 1997
  • Firstpage
    715
  • Abstract
    While significant improvements have been made in large vocabulary continuous speech recognition of large read-speech corpora such as the ARPA Wall Street Journal-based CSR corpus (WSJ) for American English and the BREF corpus for French, these tasks remain relatively artificial. In this paper we report on our development work in moving from laboratory read speech data to real-world speech data in order to build a system for the new ARPA broadcast news transcription task. The LIMSI Nov96 speech recognizer makes use of continuous density HMMs with Gaussian mixtures for acoustic modeling and n-gram statistics estimated on newspaper texts. The acoustic models are trained on WSJ0/WSJ1 and adapted using MAP estimation with task-specific training data. The overall word error on the Nov96 partitioned evaluation test was 27.1%.
  • Keywords
    Gaussian processes; hidden Markov models; maximum likelihood estimation; radio broadcasting; speech recognition; television broadcasting; ARPA broadcast news transcription; Gaussian mixtures; LIMSI Nov96 speech recognizer; MAP estimation; Nov96 partitioned evaluation test; WSJ0/WSJ1; acoustic modeling; broadcast news shows; continuous density HMMs; n-gram statistics; newspaper texts; real-world speech data; task-specific training data; word error; Broadcasting; Hidden Markov models; Laboratories; Loudspeakers; Speech enhancement; Speech recognition; Telephony; Text recognition; Training data; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
  • Conference_Location
    Munich
  • ISSN
    1520-6149
  • Print_ISBN
    0-8186-7919-0
  • Type
    conf
  • DOI
    10.1109/ICASSP.1997.596002
  • Filename
    596002