Title :
Progress in example based automatic speech recognition
Author :
Demuynck, Kris ; Seppi, Dino ; Van hamme, Hugo ; Van Compernolle, Dirk
Author_Institution :
Dept. of Electr. Eng., Katholieke Univ. Leuven, Leuven, Belgium
Abstract :
In this paper, we present a number of recent improvements to the template-based speech recognition system developed at ESAT. Combined, these improvements reduced the word error rate from 9.6% to 8.2% on the Nov92, 20k trigram, Wall Street Journal task. The improvements are along different lines. Apart from the time warping already applied within the DTW, it was found beneficial to apply additional length compensation on the template score. The single best score was replaced by a weighted k-NN average, while maintaining natural successor information as an ensemble cost. The local geometry of the acoustic space is now taken into account by assigning a diagonal covariance matrix to each input frame. Context sensitivity of short templates is increased by taking cross-boundary scores into account when sorting the N-best templates. Furthermore, boundaries on the template segmentations may be relaxed. Finally, context-dependent word templates are now used for short words. Several other variants that were not retained in the final system are discussed as well.
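The following is a minimal illustrative sketch, not the system described in the abstract. It shows how a DTW template score, a length compensation step, and a weighted k-NN average can fit together. The function names, the per-frame normalisation, and the 1/rank weights are all assumptions made for illustration; the abstract does not specify the actual compensation or weighting scheme.

```python
def dtw_distance(query, template):
    """Accumulated squared-Euclidean DTW cost between two frame sequences.

    Each sequence is a list of feature vectors (lists of floats).
    """
    n, m = len(query), len(template)
    inf = float("inf")
    # cost[i][j]: best accumulated cost aligning query[:i] with template[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = sum((a - b) ** 2 for a, b in zip(query[i - 1], template[j - 1]))
            cost[i][j] = local + min(cost[i - 1][j],      # query frame repeated
                                     cost[i][j - 1],      # template frame repeated
                                     cost[i - 1][j - 1])  # diagonal step
    return cost[n][m]


def length_compensated_score(query, template):
    # Assumption: divide by the number of query frames. This is a placeholder
    # for the (unspecified) length compensation mentioned in the abstract.
    return dtw_distance(query, template) / len(query)


def weighted_knn_score(query, templates, k=3):
    # Keep the k best (lowest) template scores instead of only the single best.
    scores = sorted(length_compensated_score(query, t) for t in templates)[:k]
    # Hypothetical weighting: 1/rank, so closer templates count more.
    weights = [1.0 / (rank + 1) for rank in range(len(scores))]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


# Toy one-dimensional "frames" for illustration only.
query = [[0.0], [1.0], [2.0]]
stretched = [[0.0], [1.0], [1.0], [2.0]]  # same pattern, one frame repeated
shifted = [[1.0], [2.0], [3.0]]           # same shape, offset by one
knn_score = weighted_knn_score(query, [query, stretched, shifted], k=3)
```

In this toy example the time-stretched template aligns with zero cost, which is the effect of the warping inside the DTW, while the shifted template contributes a nonzero cost to the weighted average.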
Keywords :
covariance matrices; speech recognition; ESAT; automatic speech recognition; diagonal covariance matrix; local geometry; natural successor information; Context; Databases; Decoding; Hidden Markov models; Speech; Speech recognition; Viterbi algorithm; DTW; Example Based Recognition; Speech Recognition; Template Based Recognition; k Nearest Neighbours;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
Print_ISBN :
978-1-4577-0538-0
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2011.5947402