DocumentCode
312172
Title
Improved extended HMM composition by incorporating power variance
Author
Minami, Yasuhiro ; Furui, Sadaoki
Author_Institution
NTT Human Interface Labs., Tokyo, Japan
Volume
2
fYear
1996
fDate
3-6 Oct 1996
Firstpage
1109
Abstract
The paper describes a way of improving extended HMM composition so that it can precisely adapt HMMs to both noisy and distorted speech. To do this, the authors incorporate the variance of power into extended HMM composition, using quantization to approximate the Gaussian distribution of the 0th order cepstrum. Consequently, the distribution of noisy speech is approximated in the linear spectral domain as a mixture of log normal distributions. The method is evaluated in a four-digit recognition experiment in which the number of digits is known. Two types of noise, computer room noise and car noise, are used; noisy and distorted speech data are created by adding this noise to speech recorded with a boundary microphone. Results show that the proposed method improves recognition rates for noisy and distorted speech compared with the authors' previous method.
Keywords
Gaussian distribution; cepstral analysis; hidden Markov models; log normal distribution; quantisation (signal); speech recognition; 0th order cepstrum; boundary microphone; car noise; computer room noise; distorted speech; four-digit recognition experiment; improved extended HMM composition; linear spectral domain; log normal distributions; noisy speech; power variance; quantization; recognition rates; Additive noise; Gaussian noise; Hidden Markov models; Log-normal distribution; Nonlinear distortion; Random variables; Speech enhancement; Vectors; Working environment noise
fLanguage
English
Publisher
IEEE
Conference_Titel
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location
Philadelphia, PA
Print_ISBN
0-7803-3555-4
Type
conf
DOI
10.1109/ICSLP.1996.607800
Filename
607800