DocumentCode
1739538
Title
Speaker normalization based on the generalized time-frequency representation and Mellin transform
Author
Dongmei, Jiang ; Rongchun, Zhao
Author_Institution
Dept. of Comput. Sci. & Eng., Northwestern Polytech. Univ., Xi'an, China
Volume
2
fYear
2000
fDate
2000
Firstpage
782
Abstract
For vocal tract length normalization in speaker-independent speech recognition, a novel feature extraction method is proposed, based on the generalized time-frequency representation with a cone-shaped kernel (CK-GTFR) and the Mellin transform. The GTFR is superior to other representations in suppressing cross terms while providing good time and frequency resolution simultaneously. The Mellin transform makes the features insensitive to differences in vocal tract length. F-ratio tests show that the proposed features have higher separation ability than the FFT cepstrum or the FFT-Mellin cepstrum, and are superior to the Mel cepstrum in most cases.
Keywords
feature extraction; signal representation; speech recognition; time-frequency analysis; transforms; F-ratio tests; FFT cepstrum; FFT-Mellin cepstrum; Mel cepstrum; Mellin transform; cone-shaped kernel; cross terms suppression; feature extraction method; frequency resolution; generalized time-frequency representation; speaker normalization; speaker-independent speech recognition; time resolution; vocal tract length normalization; vocal tract lengths; Cepstrum; Computer science; Feature extraction; Fourier transforms; Kernel; Loudspeakers; Signal resolution; Speech recognition; Testing; Time frequency analysis
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing Proceedings, 2000. WCCC-ICSP 2000. 5th International Conference on
Conference_Location
Beijing
Print_ISBN
0-7803-5747-7
Type
conf
DOI
10.1109/ICOSP.2000.891628
Filename
891628