Application of a locality preserving discriminant analysis approach to ASR

Author

Tomar, Vikrant Singh ; Rose, Richard C.

Author_Institution

Dept. of Electr. & Comput. Eng., McGill Univ., Montreal, QC, Canada

fYear

2012

fDate

2-5 July 2012

Firstpage

103

Lastpage

107

Abstract

This paper presents a comparison of three techniques for dimensionality reduction in feature analysis for automatic speech recognition (ASR). All three approaches estimate a linear transformation that is applied to concatenated log spectral features and provide a mechanism for efficient modeling of spectral dynamics in ASR. The goal of the paper is to investigate the effectiveness of a discriminative approach for estimating these feature space transformations, which is based on the assumption that speech features lie on a non-linear manifold. This approach is referred to as locality preserving discriminant analysis (LPDA) and is based on the principle of preserving local within-class relationships in this non-linear space while at the same time maximizing separability between classes. LPDA was compared to two well-known approaches for dimensionality reduction, linear discriminant analysis (LDA) and locality preserving linear projection (LPP), on the Aurora 2 speech-in-noise task. The LPDA approach was found to provide a significant reduction in word error rate (WER) with respect to the other techniques for most noise types and signal-to-noise ratios (SNRs).
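
The shared structure described in the abstract (concatenate neighbouring log spectral frames, then apply one linear transformation to reduce dimensionality) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names `splice` and `project`, the edge-padding rule, the toy frame values, and the matrix `P` are all assumptions. In particular, `P` is an arbitrary placeholder; LDA, LPP, and LPDA differ precisely in how such a matrix would be estimated from training data, and that estimation step is not shown here.

```python
def splice(frames, context):
    """Concatenate each frame with `context` neighbours on each side.

    Edge frames are padded by repeating the first/last frame (an assumption).
    """
    n = len(frames)
    spliced = []
    for t in range(n):
        vec = []
        for k in range(t - context, t + context + 1):
            idx = min(max(k, 0), n - 1)  # clamp at the utterance edges
            vec.extend(frames[idx])
        spliced.append(vec)
    return spliced


def project(vectors, transform):
    """Apply a linear transform (given as a list of rows) to each vector: y = P x."""
    return [[sum(p * x for p, x in zip(row, vec)) for row in transform]
            for vec in vectors]


# Toy data: 4 frames of 2 log spectral coefficients (made-up values).
frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
spliced = splice(frames, context=1)  # each spliced vector has 3 * 2 = 6 dims

# Placeholder 2 x 6 matrix (hypothetical). In practice its rows would be
# estimated by LDA, LPP, or LPDA from labelled training data.
P = [[1, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 1]]
reduced = project(spliced, P)  # 6 dims -> 2 dims per frame
```

Splicing is what lets a single static transformation capture spectral dynamics: each projected vector depends on several consecutive frames rather than on one frame alone.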

Keywords

data reduction; speech recognition; Aurora 2 speech; LPDA; SNR; automatic speech recognition; concatenated log spectral features; dimensionality reduction; efficient modeling; feature analysis; feature space transformations; linear discriminant analysis; linear transformation; locality preserving discriminant analysis; locality preserving linear projection; noise types; nonlinear manifold; nonlinear space; separability; signal-to-noise ratios; spectral dynamics; speech features; Hidden Markov models; Kernel; Manifolds; Noise; Speech; Training; Vectors; Graph embedding; feature extraction

fLanguage

English

Publisher

IEEE

Conference_Titel

Information Science, Signal Processing and their Applications (ISSPA), 2012 11th International Conference on

Conference_Location

Montreal, QC

Print_ISBN

978-1-4673-0381-1

Electronic_ISBN

978-1-4673-0380-4

Type

conf

DOI

10.1109/ISSPA.2012.6310443

Filename

6310443