DocumentCode :
2799609
Title :
Improved voice activity detection using static harmonic features
Author :
Fukuda, Takashi ; Ichikawa, Osamu ; Nishimura, Masafumi
Author_Institution :
IBM Res. - Tokyo, Yamato, Japan
fYear :
2010
fDate :
14-19 March 2010
Firstpage :
4482
Lastpage :
4485
Abstract :
Accurate voice activity detection (VAD) is important for robust automatic speech recognition (ASR) systems. We have proposed a statistical-model-based VAD using the long-term temporal information in speech, which shows good robustness against noise in an automobile environment. For further improvement, this paper describes a new method to exploit harmonic structure information with statistical models. In our approach, local peaks considered to be harmonic structures are extracted, without explicit pitch detection and voiced-unvoiced classification. The proposed method including both long-term temporal and static harmonic features led to considerable improvements under low SNR conditions in our VAD testing. In addition, the word error rate was reduced by 29.1% in a test that included a full ASR system.
Keywords :
acoustic noise; speech processing; speech recognition; statistical analysis; ASR systems; automatic speech recognition systems; automobile environment; harmonic structure information; long term temporal speech information; static harmonic features; statistical model based VAD; statistical models; voice activity detection; Automatic speech recognition; Discrete cosine transforms; Information filtering; Information filters; Noise robustness; Power harmonic filters; Signal to noise ratio; Speech enhancement; Working environment noise; harmonic structure; long-term temporal information
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
ISSN :
1520-6149
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2010.5495598
Filename :
5495598