Title :
Comparison of Vocal Tract Length Normalization technique applied for clean and noisy speech
Author :
Giurgiu, Mircea ; Kabir, Ahsanul
Author_Institution :
Dept. of Telecommun., Tech. Univ. of Cluj-Napoca, Cluj-Napoca, Romania
Abstract :
Vocal Tract Length Normalization (VTLN) is a well-known and widely accepted technique for reducing inter-speaker variation, and it works particularly well in clean environments. This paper examines the applicability of VTLN in noisy environments. The question we ask is whether the performance of a current state-of-the-art Automatic Speech Recognizer (ASR) can be reliably improved by applying VTLN despite a large mismatch between the operating environments (clean and noisy). Our experiments demonstrate that feature-based VTLN improves ASR performance on clean speech, and by comparison we present the drawbacks of this technique when applied to noisy speech. Therefore, feature-based VTLN in noise should be applied with care and combined with other dedicated techniques for environment compensation, such as adaptive filtering or energy normalization. We also point out the reasons why feature-based VTLN is less effective for noisy speech than for clean speech.
Keywords :
speech recognition; ASR; automatic speech recognizer; clean speech; feature based VTLN; interspeaker variation; noisy speech; vocal tract length normalization technique; Noise measurement; Signal to noise ratio; Silicon; Speech; Speech processing; Speech recognition; Formant Pattern Model; Frequency Warping; Glimpsing Speech; Speech Resynthesis; Vocal Tract Length Normalization;
Conference_Title :
Telecommunications and Signal Processing (TSP), 2011 34th International Conference on
Conference_Location :
Budapest
Print_ISBN :
978-1-4577-1410-8
DOI :
10.1109/TSP.2011.6043710