Title :
Strong-sense class-dependent features for statistical recognition
Author :
Omar, Mohamed Kamal ; Hasegawa-Johnson, Mark
Author_Institution :
Dept. of Electr. & Comput. Eng., Illinois Univ., Urbana, IL, USA
fDate :
28 Sept.-1 Oct. 2003
Abstract :
In statistical classification and recognition problems with many classes, it is commonly the case that different classes exhibit wildly different properties. In this case, it is unreasonable to expect features designed to represent all the classes to summarize these properties. Instead, features should be designed to represent subsets of classes that exhibit common properties, without regard to any class outside the subset. The value of these features for classes outside the subset may be meaningless, or simply undefined. The main problem, due to the statistical nature of the recognizer, is how to compare likelihoods conditioned on different sets of features in order to decode an input pattern. This paper introduces a class-dependent feature design approach that can be integrated with any probabilistic model. The approach avoids the need for a conditional probabilistic model for each pair of class and feature type, and therefore decreases the computational and storage requirements of using heterogeneous features. The paper presents an algorithm to calculate the class-dependent features that minimize an estimate of the relative entropy between the conditional probabilistic model and the actual conditional probability density function (PDF) of the features of each class. The approach is applied to a hidden Markov model (HMM) automatic speech recognition (ASR) system. A nonlinear class-dependent volume-preserving transformation of the features is used to minimize the objective function. Using this approach, a 2% improvement in phoneme recognition accuracy is achieved compared to the baseline system. The approach also shows improvement in recognition accuracy compared to a previous class-dependent linear feature transformation.
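The abstract's reliance on a volume-preserving transformation can be illustrated with a small sketch. A map whose Jacobian determinant has magnitude 1 leaves densities unchanged under the change of variables, so likelihoods computed in different class-specific feature spaces stay directly comparable without a Jacobian correction term. The code below is only an illustration under stated assumptions: the additive coupling map `coupling_transform`, its parameters `a` and `b`, the diagonal Gaussian class models, and the two-class `classes` table are all hypothetical and are not taken from the paper, which does not specify the form of its transformation in this abstract.

```python
import math

def coupling_transform(x, a=0.8, b=1.5):
    """Hypothetical nonlinear volume-preserving map on 2-D features.

    The Jacobian is lower triangular with ones on the diagonal,
    so its determinant is exactly 1 for any a and b.
    """
    x1, x2 = x
    return (x1, x2 + a * math.tanh(b * x1))

def numerical_jacobian_det(f, x, eps=1e-6):
    """Central-difference estimate of det(df/dx) for a 2-D map f."""
    x1, x2 = x
    fp1, fm1 = f((x1 + eps, x2)), f((x1 - eps, x2))
    fp2, fm2 = f((x1, x2 + eps)), f((x1, x2 - eps))
    j11 = (fp1[0] - fm1[0]) / (2 * eps)
    j21 = (fp1[1] - fm1[1]) / (2 * eps)
    j12 = (fp2[0] - fm2[0]) / (2 * eps)
    j22 = (fp2[1] - fm2[1]) / (2 * eps)
    return j11 * j22 - j12 * j21

def diag_gaussian_logpdf(y, mean, var):
    """Log density of a diagonal-covariance Gaussian at y."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (yi - m) ** 2 / v)
        for yi, m, v in zip(y, mean, var)
    )

def class_log_likelihood(x, transform, mean, var):
    """Score x under a class model defined in that class's own feature space.

    Because the transform is volume preserving, no log-Jacobian term is
    added, and the result is still a valid log density over x.
    """
    return diag_gaussian_logpdf(transform(x), mean, var)

# Assumed toy setup: each class has its own transform and Gaussian model.
classes = {
    "A": (lambda x: coupling_transform(x, a=0.8, b=1.5), (0.0, 0.0), (1.0, 1.0)),
    "B": (lambda x: coupling_transform(x, a=-0.5, b=0.7), (2.0, 1.0), (1.0, 1.0)),
}

def decode(x):
    """Pick the class with the highest log-likelihood for observation x."""
    return max(classes, key=lambda c: class_log_likelihood(x, *classes[c]))
```

Calling `decode((0.0, 0.0))` compares the two class scores directly even though each is computed after a different transformation; with a map that did not preserve volume, a per-class log-determinant term would have to be added before the comparison.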
Keywords :
hidden Markov models; maximum likelihood estimation; minimisation; probability; speech recognition; automatic speech recognition system; heterogeneous features; hidden Markov model; input pattern decoding; likelihood comparison; nonlinear volume-preserving transformation; phoneme recognition accuracy; probabilistic model integration; probability density function; relative entropy estimation minimization; statistical recognition; strong-sense class-dependent features; Automatic speech recognition; Decoding; Entropy; Hidden Markov models; Pattern recognition; Performance loss; Probability density function; Robustness; Speech recognition;
Conference_Title :
Statistical Signal Processing, 2003 IEEE Workshop on
Print_ISBN :
0-7803-7997-7
DOI :
10.1109/SSP.2003.1289454