• DocumentCode
    3547872
  • Title
    Speaker identification based on sparse subspace model
  • Author
    Longting Xu ; Zhen Yang
  • Author_Institution
    Coll. of Commun. & Inf. Eng., Nanjing Univ. of Posts & Telecommun., Nanjing, China
  • fYear
    2013
  • fDate
    29-31 Aug. 2013
  • Firstpage
    37
  • Lastpage
    41
  • Abstract
    Mel Frequency Cepstrum Coefficients (MFCC) have proven extremely successful for text-independent speaker identification. We address the speaker identification problem with a novel Sparse Representation-Subspace algorithm. For each speaker, we propose to develop an overcomplete dictionary from the Mel filterbank log energies of all the training utterances. We then represent the learned dictionary as a linear combination of all the log energies, which yields a naturally sparse representation that serves as the novel subspace of the speaker. Moreover, whereas the DCT step of MFCC applies a fixed matrix, the dictionary learned for each speaker is more adaptive. In the identification process, the unknown vectors of Mel filterbank log energies are projected onto each subspace to decide the matching speaker. Experiments were conducted on a speech database recorded in our anechoic chamber, and a comparison with MFCC-based speaker identification algorithms shows favorable performance for the proposed algorithm. The results vary with the sparsity level and the dictionary size.
  • Keywords
    anechoic chambers (acoustic); cepstral analysis; channel bank filters; sparse matrices; speaker recognition; DCT; MFCC; Mel filterbank log energies coefficients; Mel frequency cepstrum coefficient; anechoic chamber; dictionary learning; sparse representation-subspace algorithm; speaker matching; speech database; text-independent speaker identification; Compressed sensing; Dictionaries; Discrete cosine transforms; Feature extraction; Filter banks; Mel frequency cepstral coefficient; Training; Mel filterbank log energies; learned dictionary; sparse representation; sparsity; speaker identification; subspace
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communications (APCC), 2013 19th Asia-Pacific Conference on
  • Conference_Location
    Denpasar
  • Print_ISBN
    978-1-4673-6048-7
  • Type
    conf
  • DOI
    10.1109/APCC.2013.6765912
  • Filename
    6765912