DocumentCode
1689138
Title
Analyzing noise robustness of MFCC and GFCC features in speaker identification
Author
Xiaojia Zhao ; DeLiang Wang
Author_Institution
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
fYear
2013
Firstpage
7204
Lastpage
7208
Abstract
Automatic speaker recognition can achieve a high level of performance in matched training and testing conditions. However, performance drops significantly in mismatched noisy conditions. Recent research indicates that a new speaker feature, gammatone frequency cepstral coefficients (GFCC), exhibits noise robustness superior to that of the commonly used mel-frequency cepstral coefficients (MFCC). To gain a deep understanding of the intrinsic robustness of GFCC relative to MFCC, we design speaker identification experiments to systematically analyze their differences and similarities. This study reveals that nonlinear rectification primarily accounts for the noise robustness differences. Moreover, this study suggests how to enhance MFCC robustness and how to further improve GFCC robustness by adopting a different time-frequency representation.
Keywords
cepstral analysis; signal representation; speaker recognition; time-frequency analysis; GFCC features; MFCC features; automatic speaker recognition; gammatone frequency cepstral coefficients; matched training; mel-frequency cepstral coefficients; nonlinear rectification; speaker identification; superior noise robustness; testing conditions; time-frequency representation; Mel frequency cepstral coefficient; Noise; Noise robustness; Robustness; Speaker recognition; Speech; GFCC; MFCC; noise robustness; speaker features; speaker identification
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location
Vancouver, BC
ISSN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2013.6639061
Filename
6639061