• DocumentCode
    3529191
  • Title
    Coping with out-of-vocabulary words: Open versus huge vocabulary ASR
  • Author
    Gerosa, Matteo ; Federico, Marcello
  • Author_Institution
    FBK-irst - Fondazione Bruno Kessler, Povo
  • fYear
    2009
  • fDate
    19-24 April 2009
  • Firstpage
    4313
  • Lastpage
    4316
  • Abstract
    This paper investigates methods for coping with out-of-vocabulary words in a large vocabulary speech recognition task, namely the automatic transcription of Italian broadcast news. Two alternative ways of augmenting a 64 K (thousand) word recognition vocabulary and language model are compared: introducing extra words with their phonetic transcriptions, up to 1.2 M (million) words, or extending the language model with so-called graphones, i.e. subword units made of phone-character sequences. Graphones and phonetic transcriptions of words are automatically generated by adapting an off-the-shelf statistical machine translation toolkit. We found that both the word-based and the graphone-based extensions improve recognition performance, with the former performing significantly better than the latter. In addition, the word-based extension approach shows interesting potential even under conditions of little supervision. In fact, when the grapheme-to-phoneme translation system is trained with only 2 K manually verified transcriptions, the final word error rate increases by just 3% relative compared with starting from a lexicon of 64 K words.
  • Keywords
    language translation; speech processing; speech recognition; statistical analysis; Italian broadcast news; automatic transcription; language model; out-of-vocabulary word; phoneme translation system; phonetic transcription; statistical machine translation toolkit; vocabulary speech recognition; word error rate; word recognition; Art; Automatic speech recognition; Broadcasting; Documentation; Error analysis; Natural languages; Robustness; Speech recognition; Training data; Vocabulary; Automatic Speech Recognition; OOV words; Open-vocabulary speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
  • Conference_Location
    Taipei
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-2353-8
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2009.4960583
  • Filename
    4960583