• DocumentCode
    2280309
  • Title
    Speech recognition of broadcast news for the European Portuguese language
  • Author
    Meinedo, Hugo ; Souto, Nuno ; Neto, Joao P.
  • Author_Institution
    L2F - Spoken Language Syst. Lab., INESC ID Lisboa/IST, Lisboa, Portugal
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    319
  • Lastpage
    322
  • Abstract
    This paper describes our work on the development of a large vocabulary continuous speech recognition system applied to a broadcast news task for the European Portuguese language in the scope of the ALERT project. We start by presenting the baseline recogniser AUDIMUS, which was originally developed with a corpus of read newspaper text. This is a hybrid system that uses a combination of phone probabilities generated by several MLPs trained on distinct feature sets. The paper details the modifications introduced in this system, namely the development of a new language model, vocabulary and pronunciation lexicon, and the training on the new data currently available from the ALERT BN corpus. The system trained with this BN corpus achieved 18.4% WER when tested with the F0 focus condition (studio, planned, native, clean), and 35.2% WER when tested in all focus conditions.
  • Keywords
    hidden Markov models; linguistics; multilayer perceptrons; speech recognition; ALERT BN corpus; ALERT project; AUDIMUS; European Portuguese language; F0 focus condition; MLPs; broadcast news task; feature sets; hybrid system; language model; large vocabulary continuous speech recognition system; phone probabilities; pronunciation lexicon; vocabulary; Audio recording; Broadcasting; Databases; Multimedia systems; Natural languages; Speech recognition; Streaming media; System testing; TV; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding, 2001. ASRU '01. IEEE Workshop on
  • Print_ISBN
    0-7803-7343-X
  • Type
    conf
  • DOI
    10.1109/ASRU.2001.1034651
  • Filename
    1034651