DocumentCode
573161
Title
Application of a locality preserving discriminant analysis approach to ASR
Author
Tomar, Vikrant Singh ; Rose, Richard C.
Author_Institution
Dept. of Electr. & Comput. Eng., McGill Univ., Montreal, QC, Canada
fYear
2012
fDate
2-5 July 2012
Firstpage
103
Lastpage
107
Abstract
This paper presents a comparison of three techniques for dimensionality reduction in feature analysis for automatic speech recognition (ASR). All three approaches estimate a linear transformation that is applied to concatenated log spectral features and provide a mechanism for efficient modeling of spectral dynamics in ASR. The goal of the paper is to investigate the effectiveness of a discriminative approach for estimating these feature space transformations that is based on the assumption that speech features lie on a non-linear manifold. This approach, referred to as locality preserving discriminant analysis (LPDA), is based on the principle of preserving local within-class relationships in this non-linear space while simultaneously maximizing separability between classes. LPDA was compared with two well-known approaches for dimensionality reduction, linear discriminant analysis (LDA) and locality preserving linear projection (LPP), on the Aurora 2 speech-in-noise task. The LPDA approach was found to provide a significant reduction in word error rate (WER) relative to the other techniques for most noise types and signal-to-noise ratios (SNRs).
Keywords
data reduction; speech recognition; Aurora 2 speech; LPDA; SNR; automatic speech recognition; concatenated log spectral features; dimensionality reduction; efficient modeling; feature analysis; feature space transformations; linear discriminant analysis; linear transformation; locality preserving discriminant analysis; locality preserving linear projection; noise types; nonlinear manifold; nonlinear space; separability; signal-to-noise ratios; spectral dynamics; speech features; Hidden Markov models; Kernel; Manifolds; Noise; Speech; Training; Vectors; Graph embedding; feature extraction
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Science, Signal Processing and their Applications (ISSPA), 2012 11th International Conference on
Conference_Location
Montreal, QC
Print_ISBN
978-1-4673-0381-1
Electronic_ISBN
978-1-4673-0380-4
Type
conf
DOI
10.1109/ISSPA.2012.6310443
Filename
6310443