DocumentCode
590633
Title
Soft-clustering technique for training data in age- and gender-independent speech recognition
Author
Enami, D. ; Faqiang Zhu ; Yamamoto, Koji ; Nakagawa, Sachiko
Author_Institution
Dept. of Comput. Sci. & Eng., Toyohashi Univ. of Technol., Toyohashi, Japan
fYear
2012
fDate
3-6 Dec. 2012
Firstpage
1
Lastpage
4
Abstract
In this paper, we propose a Gaussian mixture model (GMM)-based soft clustering of training data and a GMM- and/or hidden Markov model (HMM)-based cluster selection method for age- and gender-independent speech recognition. In speaker-class-dependent speech recognition, increasing the number of speaker classes typically yields more specific models and thus better recognition performance. However, as the number of classes grows, the amount of training data available for each class model shrinks, which makes the model parameters unreliable. To address this reduction in per-class training data, we propose a GMM-based soft clustering method that allows overlap between clusters, together with a method for selecting a speaker model using a GMM and/or an HMM. In an experiment, we obtained a 5.0% absolute improvement in word error rate (WER), corresponding to a 24.9% relative improvement, over an age- and gender-dependent baseline.
Keywords
Gaussian processes; hidden Markov models; learning (artificial intelligence); speech recognition; GMM; Gaussian mixture model; HMM-based cluster selection; WER; age-independent speech recognition; gender-independent speech recognition; hidden Markov model; soft clustering; soft-clustering technique; speaker model; speaker-class-dependent speech recognition; training data reduction; word error rate; Adaptation models; Context modeling; Educational institutions; Hidden Markov models; Lead; Training;
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal & Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012 Asia-Pacific
Conference_Location
Hollywood, CA
Print_ISBN
978-1-4673-4863-8
Type
conf
Filename
6411780
Link To Document