• DocumentCode
    573161
  • Title
    Application of a locality preserving discriminant analysis approach to ASR
  • Author
    Tomar, Vikrant Singh; Rose, Richard C.
  • Author_Institution
    Dept. of Electr. & Comput. Eng., McGill Univ., Montreal, QC, Canada
  • fYear
    2012
  • fDate
    2-5 July 2012
  • Firstpage
    103
  • Lastpage
    107
  • Abstract
    This paper presents a comparison of three techniques for dimensionality reduction in feature analysis for automatic speech recognition (ASR). All three approaches estimate a linear transformation that is applied to concatenated log spectral features and provide a mechanism for efficient modeling of spectral dynamics in ASR. The goal of the paper is to investigate the effectiveness of a discriminative approach for estimating these feature space transformations, which is based on the assumption that speech features lie on a non-linear manifold. This approach is referred to as locality preserving discriminant analysis (LPDA) and is based on the principle of preserving local within-class relationships in this non-linear space while at the same time maximizing separability between classes. This approach was compared to two well-known approaches for dimensionality reduction, linear discriminant analysis (LDA) and locality preserving linear projection (LPP), on the Aurora 2 speech-in-noise task. The LPDA approach was found to provide a significant reduction in word error rate (WER) with respect to the other techniques for most noise types and signal-to-noise ratios (SNRs).
  • Keywords
    data reduction; speech recognition; Aurora 2 speech; LPDA; SNR; automatic speech recognition; concatenated log spectral features; dimensionality reduction; efficient modeling; feature analysis; feature space transformations; linear discriminant analysis; linear transformation; locality preserving discriminant analysis; locality preserving linear projection; noise types; nonlinear manifold; nonlinear space; separability; signal-to-noise ratios; spectral dynamics; speech features; Hidden Markov models; Kernel; Manifolds; Noise; Speech; Training; Vectors; Graph embedding; feature extraction
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Science, Signal Processing and their Applications (ISSPA), 2012 11th International Conference on
  • Conference_Location
    Montreal, QC
  • Print_ISBN
    978-1-4673-0381-1
  • Electronic_ISBN
    978-1-4673-0380-4
  • Type
    conf
  • DOI
    10.1109/ISSPA.2012.6310443
  • Filename
    6310443