DocumentCode :
573161
Title :
Application of a locality preserving discriminant analysis approach to ASR
Author :
Tomar, Vikrant Singh ; Rose, Richard C.
Author_Institution :
Dept. of Electr. & Comput. Eng., McGill Univ., Montreal, QC, Canada
fYear :
2012
fDate :
2-5 July 2012
Firstpage :
103
Lastpage :
107
Abstract :
This paper presents a comparison of three techniques for dimensionality reduction in feature analysis for automatic speech recognition (ASR). All three approaches estimate a linear transformation that is applied to concatenated log spectral features and provide a mechanism for efficient modeling of spectral dynamics in ASR. The goal of the paper is to investigate the effectiveness of a discriminative approach for estimating these feature space transformations, based on the assumption that speech features lie on a non-linear manifold. This approach is referred to as locality preserving discriminant analysis (LPDA) and is based on the principle of preserving local within-class relationships in this non-linear space while at the same time maximizing separability between classes. It was compared to two well-known approaches for dimensionality reduction, linear discriminant analysis (LDA) and locality preserving linear projection (LPP), on the Aurora 2 speech-in-noise task. The LPDA approach was found to provide a significant reduction in word error rate (WER) with respect to the other techniques for most noise types and signal-to-noise ratios (SNRs).
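As a minimal illustration of the setup the abstract describes (a linear transformation applied to concatenated log spectral features), the sketch below splices neighbouring frames into one long vector and projects it to a lower dimension. The function names `splice_frames` and `project`, the context width, and the toy projection matrix are all assumptions made for illustration; in the paper the matrix would be estimated by LDA, LPP, or LPDA, and that estimation is not shown here.

```python
def splice_frames(frames, context):
    """Concatenate each frame with `context` neighbours on either side.

    Frames at the utterance edges are padded by repeating the first or
    last frame (an assumption; other padding schemes are possible).
    """
    n = len(frames)
    spliced = []
    for t in range(n):
        vec = []
        for offset in range(-context, context + 1):
            idx = min(max(t + offset, 0), n - 1)  # clamp index at the edges
            vec.extend(frames[idx])
        spliced.append(vec)
    return spliced


def project(vectors, transform):
    """Apply a linear transform, given as a list of rows, to each vector."""
    return [
        [sum(w * x for w, x in zip(row, vec)) for row in transform]
        for vec in vectors
    ]


# Toy input: 4 frames with 2 log spectral values each.
frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

# One frame of context on each side gives 3 * 2 = 6 dimensions per vector.
spliced = splice_frames(frames, context=1)

# Hypothetical 2 x 6 projection that simply selects the centre frame.
# A real system would estimate this matrix from labelled training data.
transform = [
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
]
reduced = project(spliced, transform)
```

The three techniques compared in the paper differ only in how the projection matrix is estimated; the splicing and projection steps stay the same.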
Keywords :
data reduction; speech recognition; Aurora 2 speech; LPDA; SNR; automatic speech recognition; concatenated log spectral features; dimensionality reduction; efficient modeling; feature analysis; feature space transformations; linear discriminant analysis; linear transformation; locality preserving discriminant analysis; locality preserving linear projection; noise types; nonlinear manifold; nonlinear space; separability; signal-to-noise ratios; spectral dynamics; speech features; Hidden Markov models; Kernel; Manifolds; Noise; Speech; Training; Vectors; Graph embedding; feature extraction
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Information Science, Signal Processing and their Applications (ISSPA), 2012 11th International Conference on
Conference_Location :
Montreal, QC
Print_ISBN :
978-1-4673-0381-1
Electronic_ISBN :
978-1-4673-0380-4
Type :
conf
DOI :
10.1109/ISSPA.2012.6310443
Filename :
6310443