Title :
Large Margin Gaussian Mixture Modeling for Phonetic Classification and Recognition
Author :
Sha, Fei ; Saul, Lawrence K.
Author_Institution :
Dept. of Comput. & Inf. Sci., Pennsylvania Univ., Philadelphia, PA
Abstract :
We develop a framework for large margin classification by Gaussian mixture models (GMMs). Large margin GMMs have many parallels to support vector machines (SVMs) but use ellipsoids to model classes instead of half-spaces. Model parameters are trained discriminatively to maximize the margin of correct classification, as measured in terms of Mahalanobis distances. The required optimization is convex over the model's parameter space of positive semidefinite matrices and can be performed efficiently. Large margin GMMs are naturally suited to large problems in multiway classification; we apply them to phonetic classification and recognition on the TIMIT database. On both tasks, we obtain significant improvement over baseline systems trained by maximum likelihood estimation. For the problem of phonetic classification, our results are competitive with other state-of-the-art classifiers, such as hidden conditional random fields.
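The idea of modeling classes with ellipsoids and measuring the margin in Mahalanobis distance can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one ellipsoid per class (rather than a full mixture), a nearest-ellipsoid decision rule, and a unit-margin hinge penalty. The function names `mahalanobis_sq`, `classify`, and `margin_hinge_loss` and the toy data are hypothetical.

```python
import numpy as np

def mahalanobis_sq(x, mu, precision):
    """Squared Mahalanobis distance of x from centre mu under a PSD matrix."""
    d = x - mu
    return float(d @ precision @ d)

def classify(x, mus, precisions):
    """Pick the class whose ellipsoid centre is nearest (assumed decision rule)."""
    dists = [mahalanobis_sq(x, m, p) for m, p in zip(mus, precisions)]
    return int(np.argmin(dists)), dists

def margin_hinge_loss(x, label, mus, precisions, margin=1.0):
    """Hinge penalty: every wrong class should be at least `margin` farther
    away than the true class (assumed form of the margin criterion)."""
    _, dists = classify(x, mus, precisions)
    return sum(max(0.0, margin + dists[label] - d)
               for c, d in enumerate(dists) if c != label)

# Toy data (hypothetical): two classes with identity-shaped ellipsoids.
mus = [np.array([0.0, 0.0]), np.array([3.0, 0.0])]
precisions = [np.eye(2), np.eye(2)]
x = np.array([1.0, 0.0])

pred, dists = classify(x, mus, precisions)
loss_if_class0 = margin_hinge_loss(x, 0, mus, precisions)
loss_if_class1 = margin_hinge_loss(x, 1, mus, precisions)
```

With the true label 0 the point already clears the margin, so the penalty is zero; with label 1 the penalty is positive, which is what training would push against by adjusting the centres and matrices.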
Keywords :
Gaussian processes; maximum likelihood estimation; speech recognition; support vector machines; TIMIT database; large margin Gaussian mixture modeling; phonetic classification; phonetic recognition; positive semidefinite matrices; Acoustic measurements; Automatic speech recognition; Databases; Ellipsoids; Information science; Kernel; Pattern recognition; Support vector machine classification
Conference_Title :
Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
Conference_Location :
Toulouse
Print_ISBN :
1-4244-0469-X
DOI :
10.1109/ICASSP.2006.1660008