Title :
Robust out-of-vocabulary rejection for low-complexity speaker independent speech recognition
Author :
Broun, C.C. ; Campbell, W.M.
Author_Institution :
Human Interface Lab., Motorola Inc., Tempe, AZ, USA
Abstract :
With the increased use of speech recognition outside of the lab environment, better out-of-vocabulary (OOV) rejection techniques are critical for the continued success of this user interface. Future speech recognition systems must not only reject OOV utterances accurately, but also maintain their performance in mismatched (i.e., noisy) conditions. In this paper, we extend our work on low-complexity, high-accuracy speaker-independent speech recognition. We present a novel rejection criterion that is shown to be robust in mismatched conditions. This technique continues our emphasis on speech recognition for resource-limited applications by providing a solution that is highly scalable, requiring no additional memory and no significant increase in computation. The technique is based on the use of multiple garbage models (on the order of 100 or more) and a novel ranking method to achieve robust performance. This method allows for a data-dependent approach that optimizes the performance over each class individually. Results are presented for a large database consisting of 166 speakers and 131 classes. Out-of-class rejection is based on 118 out-of-vocabulary phrases and 3 categories of spurious inputs (breath noise, coughs, and lip smacks). Performance is shown to be superior to the approximated optimal Bayes reject rule.
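The abstract names the ingredients of the rejection criterion (many garbage models, a ranking step, and per-class tuning) but not its details. The following is only a minimal sketch of one way such a rank-based reject rule could look, not the authors' method: the function name `rank_reject`, the per-class threshold array `max_rank`, and the choice to count garbage models that outscore the best vocabulary class are all assumptions made for illustration.

```python
import numpy as np

def rank_reject(vocab_scores, garbage_scores, max_rank):
    """Hypothetical rank-based rejection (illustrative assumption only).

    vocab_scores   : score of each in-vocabulary class for one utterance
    garbage_scores : score of each garbage model for the same utterance
    max_rank       : assumed per-class threshold on the allowed rank
    Returns (best_class_index, accepted).
    """
    vocab_scores = np.asarray(vocab_scores, dtype=float)
    garbage_scores = np.asarray(garbage_scores, dtype=float)

    # Top-scoring in-vocabulary class is the recognition hypothesis.
    best = int(np.argmax(vocab_scores))

    # Rank = number of garbage models that score strictly higher
    # than the hypothesis (0 means the hypothesis beats them all).
    rank = int(np.sum(garbage_scores > vocab_scores[best]))

    # Accept only if the rank is within the threshold tuned for this class.
    return best, rank <= max_rank[best]

# Example with made-up scores: class 1 wins and one garbage model beats it,
# which the assumed threshold for class 1 tolerates.
best, accepted = rank_reject([0.2, 0.9, 0.1], [0.5, 0.95, 0.3, 0.1], [0, 1, 0])
```

Because a rule of this kind compares ranks rather than raw score values, a uniform shift in scores across all models would leave the decision unchanged, which is one plausible reason a ranking approach could hold up in noisy conditions.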
Keywords :
computational complexity; pattern classification; speech recognition; OOV utterances rejection; breath noise; coughs; data dependent approach; lipsmack; low-complexity; mismatched conditions; multiple garbage models; noisy conditions; out-of-class rejection; out-of-vocabulary rejection; ranking method; robust performance; speaker independent speech recognition; spurious inputs; user interface; Cost function; Humans; Noise robustness; Optimization methods; Pattern recognition; Polynomials; Speech recognition; Statistical distributions; User interfaces; Working environment noise;
Conference_Title :
Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
Conference_Location :
Istanbul
Print_ISBN :
0-7803-6293-4
DOI :
10.1109/ICASSP.2000.862106