DocumentCode :
1449135
Title :
Hidden-Markov-model-based voice activity detector with high speech detection rate for speech enhancement
Author :
Veisi, H. ; Sameti, Hossein
Author_Institution :
Dept. of Comput. Eng., Sharif Univ. of Technol., Tehran, Iran
Volume :
6
Issue :
1
fYear :
2012
fDate :
2/1/2012
Firstpage :
54
Lastpage :
63
Abstract :
A new voice activity detection (VAD) algorithm with soft-decision output in the Mel-frequency domain is developed based on the hidden Markov model (HMM) and is incorporated into an HMM-based speech enhancement system. The proposed VAD uses a two-state ergodic HMM whose states represent speech presence and speech absence. The states are constructed from the noisy-speech and noise HMMs used in the speech enhancement system. This composite model provides robust detection of speech segments in the presence of noise and obviates the need for extra modeling in HMM-based speech enhancement applications. Because the main purpose of the proposed VAD is to detect speech segments accurately, a hang-over mechanism is proposed and applied to the output of the VAD to improve the speech detection rate. The VAD is integrated into the HMM-based speech enhancement system in the Mel-frequency spectral (MFS) and cepstral (MFC) domains. The performance of the proposed VAD, the effectiveness of the hang-over mechanism and the performance of the VAD-integrated speech enhancement system are evaluated on four noise types at different SNR levels. The experimental results confirm the superiority of the proposed VAD over the reference methods, particularly in speech detection rate under the dominant noisy conditions.
Keywords :
cepstral analysis; hidden Markov models; signal detection; speech enhancement; HMM-based speech enhancement applications; HMM-based speech enhancement system; MFC domains; MFS domains; VAD algorithm; VAD-integrated speech enhancement system; dominant noisy conditions; hang-over mechanism; hidden Markov model; hidden-Markov-model-based voice activity detector; mel-frequency cepstral domains; mel-frequency domain; mel-frequency spectral domains; noise HMM; noise types; noisy speech; robust detection; soft decision output; speech absence; speech detection rate; speech presence; speech segments detection; two-state ergodic HMM; voice activity detection algorithm;
fLanguage :
English
Journal_Title :
Signal Processing, IET
Publisher :
IET
ISSN :
1751-9675
Type :
jour
DOI :
10.1049/iet-spr.2010.0282
Filename :
6152271