  • DocumentCode
    672322
  • Title
    Learning better lexical properties for recurrent OOV words
  • Author
    Qin, Long; Rudnicky, Alex
  • Author_Institution
    M*Modal Inc., Pittsburgh, PA, USA
  • fYear
    2013
  • fDate
    8-12 Dec. 2013
  • Firstpage
    19
  • Lastpage
    24
  • Abstract
    Out-of-vocabulary (OOV) words can appear more than once in a conversation or over a period of time. Such multiple instances of the same OOV word provide valuable information for learning the lexical properties of the word. Therefore, we investigated how to better estimate the pronunciation, spelling and part-of-speech (POS) label of recurrent OOV words. We first identified recurrent OOV words in the output of a hybrid decoder by applying a bottom-up clustering approach. Then, multiple instances of the same OOV word were used simultaneously to learn the properties of that word. The experimental results showed that the bottom-up clustering approach is very effective at detecting the recurrence of OOV words. Furthermore, by using evidence from multiple instances of the same word, the pronunciation accuracy, recovery rate and POS label accuracy of recurrent OOV words can be substantially improved.
  • Keywords
    learning (artificial intelligence); pattern clustering; speech recognition; POS label accuracy; bottom-up clustering approach; hybrid decoder; lexical properties; out-of-vocabulary words; pronunciation accuracy; recovery rate; recurrent OOV words; speech recognition systems; Accuracy; Acoustics; Context; Feature extraction; Speech; Speech recognition; Testing; OOV word detection; OOV word learning; distributed evidence
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on
  • Conference_Location
    Olomouc
  • Type
    conf
  • DOI
    10.1109/ASRU.2013.6707699
  • Filename
    6707699