DocumentCode :
353510
Title :
A new phonetic tied-mixture model for efficient decoding
Author :
Lee, Akinobu ; Kawahara, Tatsuya ; Takeda, Kazuya ; Shikano, Kiyohiro
Author_Institution :
Kyoto Univ., Japan
Volume :
3
fYear :
2000
fDate :
2000
Firstpage :
1269
Abstract :
A phonetic tied-mixture (PTM) model for efficient large-vocabulary continuous speech recognition is presented. It is synthesized from context-independent phone models with 64 mixture components per state by assigning different mixture weights according to the shared states of triphones. The mixtures are then re-estimated for optimization. The model achieves a word error rate of 7.0% on a 20,000-word newspaper dictation task, which is comparable to the best figure obtained with triphone models of much higher resolution. Compared with conventional PTMs that share Gaussians across all states, the proposed model is easily trained and reliably estimated. Furthermore, the model enables the decoder to perform efficient Gaussian pruning: computing only two of the 64 components is found to cause no loss of accuracy. Several pruning methods are proposed and compared, and the best one reduces the computation to about 20%.
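The idea in the abstract, a Gaussian codebook shared by several tied states that differ only in their mixture weights, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the diagonal-covariance form, and the simple "keep the k best-scoring Gaussians" rule. In particular, this sketch scores the full codebook before selecting, so it shows only the top-k approximation and not the cost saving; the abstract does not describe how the paper's pruning methods avoid evaluating the discarded components.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty list."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def codebook_scores(x, codebook):
    """Score each Gaussian of one shared codebook once for the frame x.

    codebook is a list of (mean, var) pairs (hypothetical layout).
    """
    return [log_gauss_diag(x, mean, var) for mean, var in codebook]


def state_log_likelihood(scores, log_weights, top_k=None):
    """Log-likelihood of one tied state.

    The Gaussian scores are shared; only log_weights is state-specific.
    If top_k is given, only the top_k best-scoring Gaussians contribute.
    """
    indices = list(range(len(scores)))
    if top_k is not None:
        indices = sorted(indices, key=lambda i: scores[i], reverse=True)[:top_k]
    return logsumexp([log_weights[i] + scores[i] for i in indices])


if __name__ == "__main__":
    # Toy 1-D codebook with four Gaussians (the paper uses 64 per state).
    codebook = [([0.0], [1.0]), ([5.0], [1.0]), ([10.0], [1.0]), ([-5.0], [1.0])]
    scores = codebook_scores([0.2], codebook)
    # Two tied states reuse the same scores with different weights.
    state_a = [math.log(w) for w in (0.7, 0.1, 0.1, 0.1)]
    state_b = [math.log(w) for w in (0.1, 0.7, 0.1, 0.1)]
    for name, lw in (("a", state_a), ("b", state_b)):
        full = state_log_likelihood(scores, lw)
        pruned = state_log_likelihood(scores, lw, top_k=2)
        print(name, full, pruned)
```

Because the codebook scores are computed once per frame and reused by every state that shares the codebook, the per-state work reduces to a weighted sum, which is the property that makes a top-k selection attractive in this kind of model.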
Keywords :
Gaussian distribution; decoding; optimisation; speech coding; speech recognition; Gaussian pruning; PTM; context-independent phone models; efficient decoding; large vocabulary continuous speech recognition; mixture weights; optimization; phonetic tied-mixture model; triphones; word error rate; Context modeling; Decoding; Error analysis; Gaussian distribution; Gaussian processes; Hidden Markov models; Speech recognition; Speech synthesis; State estimation; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
Conference_Location :
Istanbul
ISSN :
1520-6149
Print_ISBN :
0-7803-6293-4
Type :
conf
DOI :
10.1109/ICASSP.2000.861808
Filename :
861808