• DocumentCode
    312172
  • Title

    Improved extended HMM composition by incorporating power variance

  • Author

    Minami, Yasuhiro ; Furui, Sadaoki

  • Author_Institution
    NTT Human Interface Labs., Tokyo, Japan
  • Volume
    2
  • fYear
    1996
  • fDate
    3-6 Oct 1996
  • Firstpage
    1109
  • Abstract
    The paper describes a way of improving extended HMM composition so that it can precisely adapt HMMs to speech that is both noisy and distorted. To do this, the authors incorporate the variance of power into extended HMM composition, using quantization to approximate the Gaussian distribution of the 0th-order cepstrum. Consequently, the distribution of noisy speech is approximated in the linear spectral domain as a mixture of log-normal distributions. The method is evaluated in a four-digit recognition experiment in which the number of digits is known. Two types of noise, computer room noise and car noise, are used; the noisy and distorted speech data are created by adding these noises to speech recorded with a boundary microphone. Results show that the proposed method improves recognition rates for noisy and distorted speech compared with the authors' previous method.
  • Keywords
    Gaussian distribution; cepstral analysis; hidden Markov models; log normal distribution; quantisation (signal); speech recognition; 0th order cepstrum; boundary microphone; car noise; computer room noise; distorted speech; four-digit recognition experiment; improved extended HMM composition; linear spectral domain; noisy speech; power variance; quantization; recognition rates; Additive noise; Gaussian noise; Nonlinear distortion; Random variables; Speech enhancement; Vectors; Working environment noise
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    0-7803-3555-4
  • Type

    conf

  • DOI
    10.1109/ICSLP.1996.607800
  • Filename
    607800