• DocumentCode
    353732
  • Title

    Robust out-of-vocabulary rejection for low-complexity speaker independent speech recognition

  • Author

    Broun, C.C. ; Campbell, W.M.

  • Author_Institution
    Human Interface Lab., Motorola Inc., Tempe, AZ, USA
  • Volume
    3
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    1811
  • Abstract
    With the increased use of speech recognition outside of the lab environment, better out-of-vocabulary (OOV) rejection techniques are critical for the continued success of this user interface. Not only must future speech recognition systems accurately reject OOV utterances, but they must also maintain their performance in mismatched (i.e., noisy) conditions. In this paper, we extend our work on low-complexity, high-accuracy speaker-independent speech recognition. We present a novel rejection criterion that is shown to be robust in mismatched conditions. This technique continues our emphasis on speech recognition for resource-limited applications by providing a solution that is highly scalable, requiring no additional memory and no significant increase in computation. The technique is based on the use of multiple garbage models (on the order of 100 or more) and a novel ranking method to achieve robust performance. This method allows for a data-dependent approach in order to optimize the performance over each class individually. Results are presented for a large database consisting of 166 speakers and 131 classes. Out-of-class rejection is based on 118 out-of-vocabulary phrases and 3 categories of spurious inputs (breath noise, coughs, and lipsmack). Performance is shown to be superior to the approximated optimal Bayes reject rule.
  • Keywords
    computational complexity; pattern classification; speech recognition; OOV utterances rejection; breath noise; coughs; data dependent approach; lipsmack; low-complexity; mismatched conditions; multiple garbage models; noisy conditions; out-of-class rejection; out-of-vocabulary rejection; ranking method; robust performance; speaker independent speech recognition; spurious inputs; user interface; Cost function; Humans; Noise robustness; Optimization methods; Pattern recognition; Polynomials; Speech recognition; Statistical distributions; User interfaces; Working environment noise;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
  • Conference_Location
    Istanbul
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-6293-4
  • Type

    conf

  • DOI
    10.1109/ICASSP.2000.862106
  • Filename
    862106