DocumentCode
353732
Title
Robust out-of-vocabulary rejection for low-complexity speaker independent speech recognition
Author
Broun, C.C. ; Campbell, W.M.
Author_Institution
Human Interface Lab., Motorola Inc., Tempe, AZ, USA
Volume
3
fYear
2000
fDate
2000
Firstpage
1811
Abstract
With the increased use of speech recognition outside the lab environment, better out-of-vocabulary (OOV) rejection techniques are critical for the continued success of this user interface. Future speech recognition systems must not only reject OOV utterances accurately, but also maintain their performance in mismatched (i.e., noisy) conditions. In this paper, we extend our work on low-complexity, high-accuracy speaker-independent speech recognition. We present a novel rejection criterion that is shown to be robust in mismatched conditions. This technique continues our emphasis on speech recognition for resource-limited applications by providing a highly scalable solution that requires no additional memory and no significant increase in computation. The technique is based on the use of multiple garbage models (on the order of 100 or more) and a novel ranking method to achieve robust performance. This method allows for a data-dependent approach that optimizes performance over each class individually. Results are presented for a large database consisting of 166 speakers and 131 classes. Out-of-class rejection is based on 118 out-of-vocabulary phrases and 3 categories of spurious inputs (breath noise, coughs, and lipsmack). Performance is shown to be superior to the approximated optimal Bayes reject rule.
Keywords
computational complexity; pattern classification; speech recognition; OOV utterances rejection; breath noise; coughs; data dependent approach; lipsmack; low-complexity; mismatched conditions; multiple garbage models; noisy conditions; out-of-class rejection; out-of-vocabulary rejection; ranking method; robust performance; speaker independent speech recognition; spurious inputs; user interface; Cost function; Humans; Noise robustness; Optimization methods; Pattern recognition; Polynomials; Speech recognition; Statistical distributions; User interfaces; Working environment noise
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
Conference_Location
Istanbul
ISSN
1520-6149
Print_ISBN
0-7803-6293-4
Type
conf
DOI
10.1109/ICASSP.2000.862106
Filename
862106