Title :
Speaker normalization based on the generalized time-frequency representation and Mellin transform
Author :
Dongmei, Jiang ; Rongchun, Zhao
Author_Institution :
Dept. of Comput. Sci. & Eng., Northwestern Polytech. Univ., Xian, China
Abstract :
For vocal tract length normalization in speaker-independent speech recognition, a novel feature extraction method is proposed that combines the generalized time-frequency representation with a cone-shaped kernel (CK-GTFR) and the Mellin transform. The GTFR is superior to other representations in suppressing cross terms while providing good time and frequency resolution simultaneously. The Mellin transform makes the features insensitive to differences in vocal tract length. F-ratio tests show that the proposed features have higher separation ability than the FFT cepstrum or the FFT-Mellin cepstrum, and are superior to the Mel cepstrum in most cases.
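The scale insensitivity attributed to the Mellin transform can be illustrated with a minimal sketch. It is not the paper's CK-GTFR pipeline: it assumes that a change in vocal tract length acts roughly as a uniform scaling of the frequency axis, and it approximates the Mellin magnitude by sampling on an exponential grid and applying an FFT. The names `mellin_magnitude`, `envelope`, and `alpha` are hypothetical and chosen for illustration only.

```python
import numpy as np

def mellin_magnitude(f, x_min=1e-2, x_max=1e2, n=1024):
    """Approximate |M(jc)| of f via exponential sampling followed by an FFT."""
    # Substituting x = exp(u) turns the Mellin integral (with s = jc)
    # into a Fourier integral over u, so a scaling f(a*x) becomes a shift in u.
    u = np.linspace(np.log(x_min), np.log(x_max), n, endpoint=False)
    g = f(np.exp(u))
    # The FFT magnitude is unaffected by a shift, hence by the scaling.
    return np.abs(np.fft.fft(g))

def envelope(x):
    # Hypothetical smooth spectral bump centred at x = 1 (illustrative only).
    return np.exp(-((x - 1.0) / 0.2) ** 2)

alpha = 1.2  # hypothetical frequency-scaling factor between two speakers

m_ref = mellin_magnitude(envelope)
m_scaled = mellin_magnitude(lambda x: envelope(alpha * x))

# Relative difference between the two magnitude spectra.
rel_diff = np.max(np.abs(m_ref - m_scaled)) / np.max(m_ref)
print(rel_diff)
```

The invariance is only approximate in practice: it holds here because the test envelope is smooth and lies well inside the sampled range, so the shift in the log domain does not push energy past the grid edges.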
Keywords :
feature extraction; signal representation; speech recognition; time-frequency analysis; transforms; F-ratio tests; FFT cepstrum; FFT-Mellin cepstrum; Mel cepstrum; Mellin transform; cone-shaped kernel; cross terms suppression; feature extraction method; frequency resolution; generalized time-frequency representation; speaker normalization; speaker-independent speech recognition; time resolution; vocal tract length normalization; vocal tract lengths; Cepstrum; Computer science; Feature extraction; Fourier transforms; Kernel; Loudspeakers; Signal resolution; Speech recognition; Testing; Time frequency analysis
Conference_Title :
Signal Processing Proceedings, 2000. WCCC-ICSP 2000. 5th International Conference on
Conference_Location :
Beijing
Print_ISBN :
0-7803-5747-7
DOI :
10.1109/ICOSP.2000.891628