Discriminative analysis of distortion sequences in speech recognition

Author

Chang, Pao-Chung ; Chen, Sin-Horng ; Juang, Biing-hwang

Author_Institution

Telecommun. Lab., Minist. of Commun., Chung-Li, Taiwan

Volume

1

Issue

3

fYear

1993

fDate

7/1/1993

Firstpage

326

Lastpage

333

Abstract

In a traditional speech recognition system, the distance score between a test token and a reference pattern is obtained by simply averaging the distortion sequence resulting from the matching of the two patterns through a dynamic programming procedure. The final decision is made by choosing the reference pattern with the minimal average distance score. If one views the distortion sequence as a form of observed features, a decision rule based on a specific discriminant function designed for the distortion sequence will obviously perform better than one based on the simple average distortion. The authors therefore suggest a linear discriminant function of the form $\Delta = \sum_{i=1}^{T} w(i)\, d(i)$ to compute the distance score $\Delta$, instead of the direct average $\Delta = \frac{1}{T} \sum_{i=1}^{T} d(i)$. Several adaptive algorithms are proposed to learn the discriminant weighting function. These include one heuristic method, two methods based on the error propagation algorithm, and one method based on the generalized probabilistic descent (GPD) algorithm. The authors study these methods in a speaker-independent speech recognition task involving utterances of the highly confusable English E-set (b, c, d, e, g, p, t, v, z). The results show that the best performance is obtained with the GPD method, which achieved 78.1% accuracy, compared to 67.6% with the traditional unweighted average method. Besides the experimental comparisons, an analytical discussion of the various training algorithms is also provided.
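The difference between the two distance scores in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy distortion values, and the hand-picked weights are all assumptions made for illustration, and the sketch assumes the distortion sequences have already been normalized to a common length T and that one weight vector is shared across reference patterns. In the paper the weights are learned (for example with GPD); no training is shown here.

```python
def average_score(d):
    """Traditional score: (1/T) * sum of d(i) over the distortion sequence."""
    return sum(d) / len(d)


def weighted_score(d, w):
    """Linear discriminant score: sum of w(i) * d(i)."""
    if len(d) != len(w):
        raise ValueError("d and w must have the same length T")
    return sum(wi * di for wi, di in zip(w, d))


def classify(distortions, w=None):
    """Return the reference label with the minimal distance score.

    distortions maps each reference label to its distortion sequence
    against the test token. If w is None the plain average is used.
    """
    if w is None:
        scores = {label: average_score(d) for label, d in distortions.items()}
    else:
        scores = {label: weighted_score(d, w) for label, d in distortions.items()}
    return min(scores, key=scores.get)


# Toy data (hypothetical): a test token matched against references "b" and "d".
# The two references differ mainly in the first two frames.
distortions = {
    "b": [0.5, 0.5, 2.0, 2.0],
    "d": [1.5, 1.5, 0.8, 0.8],
}

# Hand-picked weights (hypothetical) that emphasize the early frames.
w = [0.4, 0.4, 0.1, 0.1]

print(classify(distortions))     # plain average
print(classify(distortions, w))  # weighted score
```

With these toy values the plain average prefers "d" (1.15 against 1.25), while the weighted score prefers "b" (0.80 against 1.36), which shows how weighting the distortion sequence can change the decision. Setting every weight to 1/T reproduces the plain average.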

Keywords

speech recognition; English E-set; adaptive algorithms; decision rule; discriminant weighting function; discriminative analysis; distance score; distortion sequences; dynamic time warping algorithm; error propagation algorithm; generalized probabilistic descent algorithm; heuristic method; linear discriminant function; speaker-independent speech recognition; training algorithms; Adaptive algorithm; Distortion measurement; Dynamic programming; Hidden Markov models; Pattern matching; Pattern recognition; Speech analysis; System testing; Vectors

fLanguage

English

Journal_Title

Speech and Audio Processing, IEEE Transactions on

Publisher

IEEE

ISSN

1063-6676

Type

jour

DOI

10.1109/89.232616

Filename

232616