Combining evidence from residual phase and MFCC features for speaker recognition

Author

Murty, K. Sri Rama ; Yegnanarayana, B.

Author_Institution

Dept. of Comput. Sci. & Eng., Indian Inst. of Technol.-Madras, Chennai, India

Volume

13

Issue

1

fYear

2006

Firstpage

52

Lastpage

55

Abstract

The objective of this letter is to demonstrate the complementary nature of speaker-specific information present in the residual phase in comparison with the information present in the conventional mel-frequency cepstral coefficients (MFCCs). The residual phase is derived from the speech signal by linear prediction analysis. Speaker recognition studies are conducted on the NIST-2003 database using the proposed residual phase and the existing MFCC features. The speaker recognition system based on the residual phase gives an equal error rate (EER) of 22%, and the system using the MFCC features gives an EER of 14%. By combining the evidence from both the residual phase and the MFCC features, an EER of 10.5% is obtained, indicating that speaker-specific excitation information is present in the residual phase. This information is useful since it is complementary to that of MFCCs.
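
The idea of combining evidence from two systems and scoring the result by EER can be illustrated with a minimal sketch. The abstract does not say how the two sets of scores are combined, so the weighted sum below (`fuse`, with a hypothetical `weight` parameter) is an assumption, and the score lists are made-up toy values, not data from the letter; they do not reproduce the 22%, 14%, or 10.5% figures. The EER is approximated as the operating point where the false-acceptance and false-rejection rates are closest.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER for two score lists (higher score = more likely genuine)."""
    thresholds = sorted(set(genuine) | set(impostor))
    best_gap, eer = None, None
    for t in thresholds:
        # False rejection rate: genuine trials scoring below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False acceptance rate: impostor trials scoring at or above the threshold.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


def fuse(scores_a, scores_b, weight=0.5):
    """Weighted-sum score fusion (an assumed rule, not taken from the letter)."""
    return [weight * a + (1 - weight) * b for a, b in zip(scores_a, scores_b)]


# Toy scores for the same four genuine and four impostor trials (illustrative only).
# System A stands in for an MFCC-based system, system B for a residual-phase one.
genuine_a, impostor_a = [0.9, 0.8, 0.3, 0.7], [0.1, 0.2, 0.6, 0.3]
genuine_b, impostor_b = [0.6, 0.7, 0.8, 0.2], [0.4, 0.3, 0.1, 0.5]

eer_a = equal_error_rate(genuine_a, impostor_a)
eer_b = equal_error_rate(genuine_b, impostor_b)
eer_fused = equal_error_rate(fuse(genuine_a, genuine_b), fuse(impostor_a, impostor_b))

print(f"EER A: {eer_a:.2f}, EER B: {eer_b:.2f}, fused: {eer_fused:.2f}")
```

In this toy data each system errs on different trials, so the fused scores separate the two classes better than either system alone, which is what complementary information means in this setting.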

Keywords

cepstral analysis; error statistics; neural nets; speaker recognition; speech processing; EER; MFCC; NIST-2003 database; autoassociative neural network; equal error rate; linear prediction analysis; mel-frequency cepstral coefficient; residual phase; speaker recognition studies; speech signal; Data mining; Error analysis; Helium; Neural networks; Signal analysis; Spatial databases; Speech analysis; glottal closure instant; linear prediction (LP) residual; speaker verification

fLanguage

English

Journal_Title

Signal Processing Letters, IEEE

Publisher

IEEE

ISSN

1070-9908

Type

jour

DOI

10.1109/LSP.2005.860538

Filename

1561210