Regional Information Center for Science and Technology - Multistream Recognition of Speech: Dealing With Unknown Unknowns

DocumentCode :

740111

Title :

Multistream Recognition of Speech: Dealing With Unknown Unknowns

Author :

Hermansky, Hynek

Author_Institution :

Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, USA

Volume :

101

Issue :

fYear :

2013

fDate :

5/1/2013

Firstpage :

1076

Lastpage :

1088

Abstract :

The paper discusses an approach for dealing with unexpected acoustic elements in speech. The approach is motivated by observations of human performance on such problems, which indicate the existence of multiple parallel processing streams in the human speech processing cognitive system, combined with the human ability to know when the correct information is being received. Some earlier relevant engineering approaches in multistream automatic recognition of speech (ASR) that aimed at processing of noisy speech and at dealing with unexpected out-of-vocabulary words are reviewed. The paper also reviews some currently active research in multistream ASR, focusing mainly on feedback-based techniques involving fusion of information between individual processing streams. The difference between the system behavior on its training data and during its operation is proposed as a substitute for the human ability of “knowing when knowing.” Most recent results indicate 9% relative improvement in error rates in phoneme recognition of high signal-to-noise ratio speech and as high as 30% relative improvements in moderate noise.

Keywords :

cognitive systems; data analysis; hearing; learning (artificial intelligence); parallel processing; sensor fusion; speech processing; speech recognition; feedback-based techniques; human auditory processing; human performance; human speech processing cognitive system; information fusion; machine recognition paradigm; multistream ASR; multistream automatic recognition of speech; noisy speech processing; parallel processing streams; phoneme recognition; signal-to-noise ratio speech; system behavior; training data; unexpected acoustic elements; unexpected input signals; unexpected out-of-vocabulary words; Acoustic signal processing; Audio systems; Context awareness; Information processing; Noise measurement; Auditory perception; confidence measures; machine learning; unexpected information

fLanguage :

English

Journal_Title :

Proceedings of the IEEE

Publisher :

IEEE

ISSN :

0018-9219

Type :

jour

DOI :

10.1109/JPROC.2012.2236871

Filename :

6428587

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=740111