Title :
Two-sensor noise robust ASR with missing frames for Aurora2 task
Author :
Demiroglu, Cenk ; Anderson, David V.
Author_Institution :
Sch. of Electr. & Comput. Eng., Georgia Inst. of Technol., Atlanta, GA, USA
Abstract :
In a recently proposed system, we used the missing frames idea for noise-robust automatic speech recognition (ASR). The key idea is that frames with energies below a certain threshold are considered unreliable. We set these frames to a silence floor and treat them as silence frames even if they contain speech. Although this discards valuable information such as transitional cues for consonants, we showed that for a small-vocabulary task the system substantially decreases the word error rate (WER) at low SNRs. We also observed that the algorithm decreases the overall computational complexity, in contrast to other proposed noise-robust systems, which typically require considerable computational power. The main drawback of the missing frames system is the difficulty of accurately detecting high-energy portions in high-noise environments. In this work, we propose using a glottal sensor to detect the high-energy portions of the acoustic signal. We show that the glottal sensor can detect the high-energy speech portions very accurately without adding significant computational complexity. The second contribution of this paper is that we add a speech enhancement algorithm at the front end of the speech recognizer. We show that the enhancement algorithm yields little improvement for the baseline system, while it decreases the WER substantially for our proposed system.
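The missing frames step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the frame length and hop, the dB threshold, and the use of a single fixed silence-floor vector are all assumptions made for illustration. The sketch computes per-frame log energy on a detection channel (which could, for example, be a glottal sensor signal), flags frames below a threshold as unreliable, and overwrites their feature vectors with the silence floor.

```python
import numpy as np


def frame_log_energy(signal, frame_len=200, hop=80):
    """Per-frame log energy in dB of a 1-D signal (hypothetical framing parameters)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    energies = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        # A small constant avoids log10(0) on all-zero frames.
        energies[i] = 10.0 * np.log10(np.sum(frame ** 2) + 1e-12)
    return energies


def mask_low_energy_frames(features, reliable, silence_vector):
    """Return a copy of `features` with unreliable frames set to the silence floor."""
    out = features.copy()
    out[~reliable] = silence_vector  # broadcast one vector over all flagged frames
    return out


# Illustrative usage with an assumed 20 dB threshold on the detection channel.
detector = np.zeros(800)
detector[320:560] = 1.0                        # stand-in for a high-energy region
energies = frame_log_energy(detector)          # one value per frame
reliable = energies > 20.0                     # True where the frame is kept
features = np.arange(8.0)[:, None] * np.ones((8, 3))   # placeholder feature matrix
silence_vector = np.full(3, -1.0)              # placeholder silence floor
masked = mask_low_energy_frames(features, reliable, silence_vector)
```

Keeping the reliability decision separate from the masking step means the same masking code applies whether the decision comes from the noisy acoustic channel or from a second sensor.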
Keywords :
acoustic noise; computational complexity; electric sensing devices; magnetic sensors; speech enhancement; speech recognition; Aurora2 task; SNR; acoustic signal; computational complexity; glottal electromagnetic sensor; high energy speech portion detection; high noise environment; missing frames system; noise robust automatic speech recognition; signal to noise ratio; speech enhancement algorithm; vocabulary; word error rate; Acoustic noise; Acoustic sensors; Acoustic signal detection; Automatic speech recognition; Computational complexity; Error analysis; Noise robustness; Speech enhancement; Vocabulary; Working environment noise;
Conference_Title :
Circuits and Systems, 2004. ISCAS '04. Proceedings of the 2004 International Symposium on
Print_ISBN :
0-7803-8251-X
DOI :
10.1109/ISCAS.2004.1329221