• DocumentCode
    353537
  • Title
    Speech recognition for a distant moving speaker based on HMM composition and separation
  • Author
    Takiguchi, T.; Nakamura, S.; Shikano, K.
  • Author_Institution
    IBM Tokyo Res. Lab., Kanagawa, Japan
  • Volume
    3
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    1403
  • Abstract
    This paper describes a hands-free speech recognition method based on HMM composition and separation for speech contaminated not only by additive noise but also by an acoustic transfer function. The method provides an improved user interface in which the user is not encumbered by microphone equipment in noisy and reverberant environments. In this approach, acoustic transfer functions are modeled by an ergodic HMM whose states correspond to different positions of the sound source, so the model can represent the source position even if the speaker moves. The HMM parameters of the acoustic transfer function are estimated by HMM separation, which reverses the process of HMM composition: the model parameters are estimated by maximizing the likelihood of adaptation data uttered from an unknown position. Therefore, measurement of impulse responses is not required. Speech from a distant moving speaker was recorded in real environments, and recognition experiments on this speech confirmed the effectiveness of HMM composition and separation.
  • Keywords
    acoustic noise; hidden Markov models; maximum likelihood estimation; speech recognition; transfer functions; HMM composition; acoustic transfer function; adaptation data; additive noise; contaminated noise; distant moving speaker; ergodic HMM; hands-free speech recognition method; separation; user interface; Acoustic noise; Additive noise; Hidden Markov models; Loudspeakers; Microphones; Speech enhancement; Speech recognition; Transfer functions; User interfaces; Working environment noise
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
  • Conference_Location
    Istanbul
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-6293-4
  • Type
    conf
  • DOI
    10.1109/ICASSP.2000.861848
  • Filename
    861848