• DocumentCode
    2956871
  • Title

    Speaker Identification using a Microphone Array and a Joint HMM with Speech Spectrum and Angle of Arrival

  • Author

    Stokes, Jack W. ; Platt, John C. ; Basu, Sumit

  • Author_Institution
    Microsoft Res., Redmond, WA
  • fYear
    2006
  • fDate
    9-12 July 2006
  • Firstpage
    1381
  • Lastpage
    1384
  • Abstract
    In this paper, we present a speaker identification algorithm for a microphone array based on a first-order joint hidden Markov model (HMM) whose observations are the angle of arrival of the speech and the speech spectrum. The goal of the research is to investigate whether including angle-of-arrival information reduces speaker identification error rates compared to an algorithm based on the speech spectrum only. The spectral model consists of a Gaussian mixture model (GMM) using multiple discriminant analysis (MDA) coefficients, and the angle model includes a separate histogram for each participant. The convergence time of the joint HMM is improved by estimating the GMM for each of the meeting participants prior to the start of the meeting and initializing each participant's spectral GMM in the joint HMM to the pretrained parameter values. The performance of the algorithm is analyzed using data collected during live meetings recorded with an eight-element circular microphone array. For meetings where the participants are stationary, the results show significant improvement over a single-channel speaker ID algorithm based on spectrum only.
  • Keywords
    Gaussian channels; convergence of numerical methods; direction-of-arrival estimation; hidden Markov models; microphone arrays; speaker recognition; GMM; Gaussian mixture model; HMM; MDA coefficient; angle-of-arrival; channel speaker ID algorithm; convergence time; hidden Markov model; microphone array; multiple discriminant analysis; speaker identification; speech spectrum; Convergence; Data analysis; Error analysis; Hidden Markov models; Histograms; Lifting equipment; Microphone arrays; Performance analysis; Signal processing; Speech processing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia and Expo, 2006 IEEE International Conference on
  • Conference_Location
    Toronto, Ont.
  • Print_ISBN
    1-4244-0366-7
  • Electronic_ISBN
    1-4244-0367-7
  • Type

    conf

  • DOI
    10.1109/ICME.2006.262796
  • Filename
    4036866