Title :
Speaker clustering for speech recognition using the parameters characterizing vocal-tract dimensions
Author :
Naito, Masaki ; Deng, Li ; Sagisaka, Yoshinori
Author_Institution :
ATR Interpreting Telephony Res. Labs., Kyoto, Japan
Abstract :
We propose speaker clustering methods based on articulatory parameters related to the vocal-tract size of individual speakers. Two parameters characterizing gross vocal-tract dimensions are first derived from the formants of speaker-specific Japanese vowels and are then used to cluster a total of 148 male Japanese speakers. The resultant speaker clusters are found to be significantly different from the speaker clusters obtained by conventional acoustic criteria. Japanese phoneme recognition experiments are carried out using speaker-clustered tied-state HMMs (HMNets) trained for each cluster. Compared with the baseline gender-dependent model, the clustering method using vocal-tract parameters achieves a 5.7% reduction in recognition error.
Keywords :
hidden Markov models; parameter estimation; speech recognition; HMNet; Japanese phoneme recognition experiments; formants; male Japanese speakers; recognition error reduction; speaker clustering; speaker-clustered tied-state HMM; speaker-specific Japanese vowels; speech recognition; vocal-tract dimensions; vocal-tract-size related articulatory parameters; Adaptation model; Clustering algorithms; Clustering methods; Databases; Hidden Markov models; Loudspeakers; Nonlinear acoustics; Speech recognition; Telecommunication computing; Tree data structures;
Conference_Title :
Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
Conference_Location :
Seattle, WA
Print_ISBN :
0-7803-4428-6
DOI :
10.1109/ICASSP.1998.675431