• DocumentCode
    2259340
  • Title

    Detection of ambiguous portions of signal corresponding to OOV words or misrecognized portions of input

  • Author

    Lacouture, Roxane ; Normandin, Yves

  • Author_Institution
    CRIM, Montreal, Que., Canada
  • Volume
    4
  • fYear
    1996
  • fDate
    3-6 Oct 1996
  • Firstpage
    2071
  • Abstract
    One of the key problems for large vocabulary ASR is the detection of unknown or misrecognized portions of the input. The paper presents results obtained using a local rejection algorithm. The algorithm is derived from the two-pass recognition algorithm of H. Murveit et al. (1993) and detects misrecognized portions based on the number of active words per frame during the second pass. The underlying hypothesis is that recognition on unexpected data, i.e. noise or out-of-vocabulary (OOV) words, is likely to activate more words, since no word matches the data well; when the match is good, fewer words should be active. The algorithm was evaluated on part of the WSJ 5K November 1993 test set (3370 words in total, none of them OOV) and on the digit-strings-only Macrophone data (14686 words, of which 895 were OOV). The results indicate that the approach is promising, both for the detection of OOV words and for the detection of misrecognized portions of the input. It may provide the base on which to build tools for dealing with these phenomena. These tools might include dialogue mechanisms based on the list of activated words corresponding to a rejected portion, display mechanisms such as reverse video, or rescoring schemes.
  • Keywords
    speech processing; speech recognition; word processing; Macrophone data; OOV words; activated words; active words; ambiguous portions; automatic speech recognition; dialogue mechanisms; digit strings; large vocabulary ASR; local rejection algorithm; misrecognized portions; out of vocabulary words; rescoring schemes; reverse video; signal detection; two pass recognition algorithm; unexpected data; Acoustic beams; Acoustic signal detection; Active noise reduction; Automatic speech recognition; Displays; Educational institutions; Lattices; Testing; Viterbi algorithm; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    0-7803-3555-4
  • Type

    conf

  • DOI
    10.1109/ICSLP.1996.607209
  • Filename
    607209
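
The hypothesis stated in the abstract (more words stay active per frame when nothing matches the signal well) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `flag_ambiguous_segments`, the moving-average smoothing, the `threshold`, `window`, and `min_len` parameters, and the example counts are all assumptions introduced here for illustration. The record gives no threshold values or smoothing details.

```python
from typing import List, Tuple


def flag_ambiguous_segments(
    active_counts: List[int],
    threshold: float,
    window: int = 3,
    min_len: int = 2,
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) frame ranges flagged as ambiguous.

    active_counts[i] is the number of active words at frame i during the
    second recognition pass. The threshold, smoothing window, and minimum
    segment length are hypothetical tuning parameters, not values from
    the paper.
    """
    n = len(active_counts)
    half = window // 2

    # Smooth with a centered moving average, truncated at the edges.
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        smoothed.append(sum(active_counts[lo:hi]) / (hi - lo))

    # Collect runs of consecutive frames whose smoothed count exceeds
    # the threshold, keeping only runs of at least min_len frames.
    segments = []
    start = None
    for i, value in enumerate(smoothed):
        if value > threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                segments.append((start, i))
            start = None
    if start is not None and n - start >= min_len:
        segments.append((start, n))
    return segments


# Made-up per-frame counts: a burst of active words in frames 4 to 7.
counts = [2, 2, 3, 2, 20, 25, 22, 24, 3, 2, 2]
segments = flag_ambiguous_segments(counts, threshold=10)
```

In a real system the flagged frame ranges would then be mapped back to word hypotheses, for example to drive the dialogue, display, or rescoring mechanisms mentioned in the abstract.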