DocumentCode :
2021087
Title :
Predicting unseen triphones with senones
Author :
Hwang, Mei-Yuh ; Huang, Xuedong ; Alleva, Fileno
Author_Institution :
Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA
Volume :
2
fYear :
1993
fDate :
27-30 April 1993
Firstpage :
311
Abstract :
In large-vocabulary speech recognition, there are always new triphones that are not covered in the training data. These unseen triphones are usually represented by corresponding diphones or context-independent monophones. It is proposed that decision-tree-based senones be used to generate the needed senonic baseforms for unseen triphones. A decision tree is built for each individual Markov state of each phone, and the leaves of the trees constitute the senone codebook. A Markov state of any triphone traverses the corresponding tree until it reaches a leaf, which identifies the senone it is to be associated with. The DARPA 5000-word speaker-independent Wall Street Journal dictation task is used to evaluate the proposed method. The word error rate is reduced by more than 10% when unseen triphones are modeled by the decision-tree-based senones.
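The tree traversal described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the phone sets, the context questions, and the senone ids below are all hypothetical, chosen only to show how each Markov state of a triphone, seen or unseen, might walk the tree for its base phone and state index down to a leaf senone.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union


@dataclass(frozen=True)
class Triphone:
    left: str   # left-context phone
    base: str   # the phone being modeled
    right: str  # right-context phone


@dataclass
class Leaf:
    senone_id: int  # index into the senone codebook (hypothetical ids)


@dataclass
class Node:
    question: Callable[[Triphone], bool]  # yes/no question about the context
    yes: "Tree"
    no: "Tree"


Tree = Union[Leaf, Node]

# Hypothetical phone classes used by the example questions.
NASALS = {"m", "n", "ng"}
VOWELS = {"aa", "ae", "eh", "ih", "iy", "uw"}


def find_senone(tree: Tree, triphone: Triphone) -> int:
    """Walk one tree from the root to a leaf and return that leaf's senone id."""
    node = tree
    while isinstance(node, Node):
        node = node.yes if node.question(triphone) else node.no
    return node.senone_id


def senonic_baseform(
    triphone: Triphone,
    num_states: int,
    trees: Dict[Tuple[str, int], Tree],
) -> List[int]:
    """One senone per Markov state, using the tree for (base phone, state)."""
    return [
        find_senone(trees[(triphone.base, state)], triphone)
        for state in range(num_states)
    ]


# Hypothetical trees for a two-state model of the phone "k".
trees: Dict[Tuple[str, int], Tree] = {
    ("k", 0): Node(
        lambda t: t.left in NASALS,
        Leaf(0),
        Node(lambda t: t.left in VOWELS, Leaf(1), Leaf(2)),
    ),
    ("k", 1): Node(lambda t: t.right in VOWELS, Leaf(3), Leaf(4)),
}

# A triphone that need not have appeared in training still reaches a leaf.
unseen = Triphone(left="n", base="k", right="ae")
baseform = senonic_baseform(unseen, num_states=2, trees=trees)
```

Because every context answers each question with yes or no, any triphone reaches some leaf, which is why this kind of lookup needs no fallback to diphones or monophones in the sketch.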
Keywords :
Markov processes; dictation; learning (artificial intelligence); speech recognition; trees (mathematics); vocabulary; Markov state; Wall Street Journal dictation; decision-tree-based senones; large-vocabulary speech recognition; senone codebook; training; unseen triphones; word error rate;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on
Conference_Location :
Minneapolis, MN, USA
ISSN :
1520-6149
Print_ISBN :
0-7803-7402-9
Type :
conf
DOI :
10.1109/ICASSP.1993.319299
Filename :
319299