• DocumentCode
    182909
  • Title

    Subspace analysis of spectral features for speaker recognition

  • Author

    Ling Chen ; Hong Man ; Huading Jia ; Zhiyi Wang ; Lei Wang ; Zili Li

  • Author_Institution
    CS Dept., Southwestern Univ. of Finance & Econ., Chengdu, China
  • fYear
    2014
  • fDate
    19-21 Aug. 2014
  • Firstpage
    98
  • Lastpage
    102
  • Abstract
    A new front-end feature extraction scheme that creates so-called LDA-projected magnitude spectrum (L-PMS) features is proposed for speaker recognition systems. Mainstream feature extraction schemes usually use a filter bank or linear predictive coding (LPC) to convert high-dimensional speech data into low-dimensional feature vectors, which may lose discriminative information that is important for speaker recognition tasks. In this work, the new feature extraction scheme takes the log of the magnitude spectrum of windowed utterance frames. After variance normalization of the spectral features, linear discriminant analysis (LDA) is applied to create features that are more discriminative than the conventional mel-frequency cepstral coefficient (MFCC) features. The new features were evaluated on the TIMIT and NTIMIT corpora, using a vector quantization (VQ) speaker model. Experiments on all 630 subjects in the TIMIT and NTIMIT corpora show that the proposed L-PMS features substantially outperform the conventional MFCC features in terms of identification rate.
  • Keywords
    cepstral analysis; feature extraction; speaker recognition; vector quantisation; L-PMS; LDA-projected magnitude spectrum; MFCC; NTIMIT corpora; VQ speaker model; discriminative information; front-end feature extraction scheme; high-dimensional speech data conversion; identification rate; linear discriminant analysis; low-dimensional feature vectors; mel-frequency cepstral coefficient; speaker recognition systems; spectral features; subspace analysis; variance normalization; vector quantization; windowed utterance frames; Feature extraction; Mel frequency cepstral coefficient; Speaker recognition; Speech; Speech recognition; Training; Vectors;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery (FSKD), 2014 11th International Conference on
  • Conference_Location
    Xiamen
  • Print_ISBN
    978-1-4799-5147-5
  • Type

    conf

  • DOI
    10.1109/FSKD.2014.6980814
  • Filename
    6980814
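
The front end described in the abstract (log magnitude spectrum of windowed frames, variance normalization, then an LDA projection) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the Hamming window, the frame length and hop size, the small ridge term added to the within-class scatter, and the synthetic two-"speaker" sine signals are all assumptions, since the abstract does not state these details. The function names are hypothetical, and the VQ speaker model is not included.

```python
import numpy as np


def log_magnitude_spectrum(signal, frame_len=256, hop=128):
    """Split a 1-D signal into Hamming-windowed frames; return log|FFT| per frame.

    frame_len, hop and the Hamming window are assumed values, not from the paper.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitude + 1e-10)  # small offset avoids log(0)


def variance_normalize(features, eps=1e-10):
    """Scale every feature dimension to (approximately) unit variance."""
    return features / (features.std(axis=0) + eps)


def fit_lda(features, labels, n_components):
    """Return a (dim x n_components) projection maximizing Fisher's criterion."""
    dim = features.shape[1]
    overall_mean = features.mean(axis=0)
    s_within = np.zeros((dim, dim))
    s_between = np.zeros((dim, dim))
    for speaker in np.unique(labels):
        x = features[labels == speaker]
        mean = x.mean(axis=0)
        centered = x - mean
        s_within += centered.T @ centered
        diff = (mean - overall_mean)[:, None]
        s_between += len(x) * (diff @ diff.T)
    # Assumed ridge term: keeps the within-class scatter invertible when
    # there are fewer frames than spectral bins.
    s_within += 1e-6 * np.trace(s_within) / dim * np.eye(dim)
    # Whiten the within-class scatter, then diagonalize the between-class
    # scatter in the whitened space (a standard way to solve the LDA problem).
    evals, evecs = np.linalg.eigh(s_within)
    whiten = evecs / np.sqrt(evals)
    evals_b, evecs_b = np.linalg.eigh(whiten.T @ s_between @ whiten)
    order = np.argsort(evals_b)[::-1][:n_components]
    return whiten @ evecs_b[:, order]


if __name__ == "__main__":
    # Synthetic stand-in for two speakers: noisy sines at different frequencies.
    rng = np.random.default_rng(0)
    t = np.arange(4000)
    signals = [np.sin(2 * np.pi * f * t) + 0.1 * rng.standard_normal(t.size)
               for f in (0.05, 0.12)]
    feats = [log_magnitude_spectrum(s) for s in signals]
    labels = np.concatenate([np.full(len(f), k) for k, f in enumerate(feats)])
    x = variance_normalize(np.vstack(feats))
    projection = fit_lda(x, labels, n_components=1)
    print((x @ projection).shape)  # one projected feature per frame
```

With C speakers, LDA yields at most C - 1 useful directions, so `n_components` would normally be chosen no larger than that; the projected frames would then be passed to whatever speaker model is in use.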