• DocumentCode
    3327651
  • Title
    Application of the modified group delay function to speaker identification and discrimination
  • Author
    Hegde, Rajesh M. ; Murthy, Hema A. ; Rao, Gadde V Ramana
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Indian Inst. of Technol., Madras, India
  • Volume
    1
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    In this paper, we explore new methods by which speakers can be identified and discriminated, using features derived from the Fourier transform phase. The modified group delay feature (MODGDF), a parameterized form of the modified group delay function, is used as the front-end feature in this study. A Gaussian mixture model (GMM) based speaker identification system is built with the MODGDF as the front-end feature. The system is tested on both clean (TIMIT) and noisy telephone (NTIMIT) speech. The results are compared with those obtained using traditional Mel frequency cepstral coefficients (MFCC), which are derived from the Fourier transform magnitude. When MFCC and MODGDF were combined, performance improved by about 4%, indicating that phase and magnitude contain complementary information. In an earlier paper (Murthy et al. (2003)), it was shown that the MODGDF possesses phoneme-specific characteristics. In this paper we show that the MODGDF has speaker-specific properties. We also attempt to understand the speaker-discriminating characteristics of the MODGDF using a nonlinear mapping technique based on Sammon mapping (Sammon (1969)) and find empirically that the MODGDF exhibits a certain level of linear separability among speakers.
  • Keywords
    Fourier transforms; Gaussian distribution; delay estimation; speaker recognition; Fourier transform; GMM; Gaussian mixture model; MODGDF; NTIMIT; Sammon mapping; TIMIT; clean speech; linear separability; modified group delay feature; modified group delay function; noisy telephone speech; nonlinear mapping; performance; speaker discrimination; speaker identification; speaker specific properties; Application software; Cepstral analysis; Computer science; Delay; Fourier transforms; Laboratories; Mel frequency cepstral coefficient; Speech; System testing; Telephony;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.2004.1326036
  • Filename
    1326036
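
The abstract in this record describes features derived from the Fourier transform phase. As a rough illustration only, the sketch below computes the standard (unmodified) group delay function of a short frame, using the well-known identity tau(k) = (X_R*Y_R + X_I*Y_I) / |X|^2, where X is the DFT of x[n] and Y is the DFT of n*x[n]. It is not the MODGDF itself: the modification and parameterization used by the authors are not given in this record and are not reproduced here. The function names `dft` and `group_delay` and the `eps` guard are assumptions made for the example.

```python
import cmath


def dft(x):
    """Naive O(N^2) discrete Fourier transform (adequate for a short illustration)."""
    n_pts = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def group_delay(x, eps=1e-12):
    """Standard group delay in samples: (X_R*Y_R + X_I*Y_I) / |X|^2.

    eps is a hypothetical guard against division by zero at spectral nulls;
    it is an assumption of this sketch, not part of the cited work.
    """
    X = dft(x)
    Y = dft([n * v for n, v in enumerate(x)])  # DFT of n * x[n]
    return [
        (Xk.real * Yk.real + Xk.imag * Yk.imag) / max(abs(Xk) ** 2, eps)
        for Xk, Yk in zip(X, Y)
    ]


# A unit impulse delayed by 3 samples has a constant group delay of 3 samples.
frame = [0.0] * 8
frame[3] = 1.0
tau = group_delay(frame)
```

For the delayed impulse, every entry of `tau` is 3 up to floating-point error, which is a quick sanity check that the identity is implemented with the right sign.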