DocumentCode :
740111
Title :
Multistream Recognition of Speech: Dealing With Unknown Unknowns
Author :
Hermansky, Hynek
Author_Institution :
Center for Language & Speech Process., Johns Hopkins Univ., Baltimore, MD, USA
Volume :
101
Issue :
5
fYear :
2013
fDate :
5/1/2013
Firstpage :
1076
Lastpage :
1088
Abstract :
The paper discusses an approach for dealing with unexpected acoustic elements in speech. The approach is motivated by observations of human performance on such problems, which indicate the existence of multiple parallel processing streams in the human speech processing cognitive system, combined with the human ability to know when the correct information is being received. Some earlier relevant engineering approaches in multistream automatic recognition of speech (ASR) that aimed at processing of noisy speech and at dealing with unexpected out-of-vocabulary words are reviewed. The paper also reviews some currently active research in multistream ASR, focusing mainly on feedback-based techniques involving fusion of information between individual processing streams. The difference between the system behavior on its training data and during its operation is proposed as a substitute for the human ability of “knowing when knowing.” The most recent results indicate a 9% relative improvement in phoneme recognition error rate for high signal-to-noise ratio speech and relative improvements as high as 30% in moderate noise.
Keywords :
cognitive systems; data analysis; hearing; learning (artificial intelligence); parallel processing; sensor fusion; speech processing; speech recognition; feedback-based techniques; human auditory processing; human performance; human speech processing cognitive system; information fusion; machine recognition paradigm; multistream ASR; multistream automatic recognition of speech; noisy speech processing; parallel processing streams; phoneme recognition; signal-to-noise ratio speech; system behavior; training data; unexpected acoustic elements; unexpected input signals; unexpected out-of-vocabulary words; Acoustic signal processing; Audio systems; Context awareness; Information processing; Noise measurement; Auditory perception; confidence measures; machine learning; unexpected information
fLanguage :
English
Journal_Title :
Proceedings of the IEEE
Publisher :
IEEE
ISSN :
0018-9219
Type :
jour
DOI :
10.1109/JPROC.2012.2236871
Filename :
6428587