DocumentCode :
1511074
Title :
Speaker and Noise Factorization for Robust Speech Recognition
Author :
Wang, Yongqiang ; Gales, Mark J F
Author_Institution :
Eng. Dept., Cambridge Univ., Cambridge, UK
Volume :
20
Issue :
7
fYear :
2012
Firstpage :
2149
Lastpage :
2158
Abstract :
Speech recognition systems need to operate in a wide range of conditions. Thus they should be robust to extrinsic variability caused by various acoustic factors, for example speaker differences, transmission channel and background noise. For many scenarios, multiple factors simultaneously impact the underlying “clean” speech signal. This paper examines techniques to handle both speaker and background noise differences. An acoustic factorization approach is adopted. Here, separate transforms are assigned to represent the speaker [maximum-likelihood linear regression (MLLR)], and noise and channel [model-based vector Taylor series (VTS)] factors. This is a highly flexible framework compared to the standard approaches of modeling the combined impact of both speaker and noise factors. For example, factorization allows the speaker characteristics obtained in one noise condition to be applied to a different environment. To obtain this factorization, modified versions of MLLR and VTS training and application are derived. The proposed scheme is evaluated for both adaptation and factorization on the AURORA4 data.
Keywords :
regression analysis; signal denoising; speech recognition; AURORA4 data; acoustic factorization; background noise; maximum-likelihood linear regression; model-based vector Taylor series; noise factorization; speaker factorization; transmission channel; Acoustics; Adaptation models; Noise; Robustness; Speech; Speech recognition; Transforms; Acoustic factorization; noise robustness; speaker adaptation; vector Taylor series (VTS)
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2012.2198059
Filename :
6196183