DocumentCode :
591779
Title :
Speaker-ensemble hidden Markov modeling for automatic speech recognition
Author :
Guoli Ye ; Mak, Brian
Author_Institution :
Dept. of Comput. Sci. & Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China
fYear :
2012
fDate :
5-8 Dec. 2012
Firstpage :
6
Lastpage :
10
Abstract :
This paper proposes a new hidden Markov model (HMM), which we call the speaker-ensemble HMM (SE-HMM). An SE-HMM is a multi-path HMM in which each path is an HMM constructed from the training data of a different speaker. An SE-HMM may be considered a form of template-based acoustic model in which speaker-specific acoustic templates are compressed statistically into speaker-specific HMMs. However, one has the flexibility of building an SE-HMM at various levels of compression: it may be built for a triphone state, a triphone, a whole utterance, or other convenient phonetic units. As a result, an SE-HMM contains more detail than a conventional HMM, but is much smaller than common template-based acoustic models. Furthermore, the construction of an SE-HMM is simple, and since it is still an HMM, its construction and computation are well supported by common HMM toolkits such as HTK. The proposed SE-HMM was evaluated on the Resource Management and Wall Street Journal tasks, and it consistently gives better word recognition results than a conventional HMM.
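The multi-path structure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes discrete emission symbols, uniform weights over the speaker paths, no transitions between paths, and likelihoods summed over paths with the forward algorithm. The names `forward_likelihood`, `build_ensemble`, `speaker_a`, and `speaker_b`, and all probabilities, are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation (an assumption, not from the paper):
# (initial probabilities, transition matrix, discrete emission matrix)
HMM = Tuple[List[float], List[List[float]], List[List[float]]]


def forward_likelihood(hmm: HMM, obs: Sequence[int]) -> float:
    """P(obs | hmm) by the forward algorithm.

    Unscaled, so only suitable for short, non-empty sequences.
    """
    init, trans, emit = hmm
    n = len(init)
    alpha = [init[i] * emit[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
            for j in range(n)
        ]
    return sum(alpha)


def build_ensemble(speaker_hmms: Sequence[HMM]) -> HMM:
    """Join speaker HMMs as parallel paths of one larger HMM.

    Assumptions: uniform path weights and block-diagonal transitions,
    so a path chosen at the start is never left.
    """
    total = sum(len(h[0]) for h in speaker_hmms)
    weight = 1.0 / len(speaker_hmms)
    init: List[float] = []
    trans: List[List[float]] = []
    emit: List[List[float]] = []
    offset = 0
    for h_init, h_trans, h_emit in speaker_hmms:
        n = len(h_init)
        init.extend(weight * p for p in h_init)
        for row in h_trans:
            # Pad each row with zeros so it only reaches its own path's states.
            trans.append([0.0] * offset + list(row) + [0.0] * (total - offset - n))
        emit.extend(list(row) for row in h_emit)
        offset += n
    return init, trans, emit


# Two toy speaker-specific HMMs (made-up numbers, two states, two symbols).
speaker_a: HMM = ([1.0, 0.0], [[0.6, 0.4], [0.0, 1.0]], [[0.9, 0.1], [0.2, 0.8]])
speaker_b: HMM = ([1.0, 0.0], [[0.3, 0.7], [0.0, 1.0]], [[0.5, 0.5], [0.1, 0.9]])

ensemble = build_ensemble([speaker_a, speaker_b])
obs = [0, 1, 1]
ensemble_likelihood = forward_likelihood(ensemble, obs)
mean_path_likelihood = 0.5 * (
    forward_likelihood(speaker_a, obs) + forward_likelihood(speaker_b, obs)
)
```

Under these assumptions the ensemble is itself an ordinary HMM, so the same forward routine scores it, and its likelihood equals the weighted sum of the per-speaker path likelihoods. A decoder that keeps only the best path (Viterbi) would instead take the maximum over paths; the abstract does not say which is used.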
Keywords :
hidden Markov models; speech recognition; SE-HMM; automatic speech recognition; resource management; speaker-ensemble HMM; speaker-ensemble hidden Markov modeling; speaker-specific acoustic templates; template-based acoustic model; triphone; triphone state; wall street journal tasks; Acoustics; Adaptation models; Hidden Markov models; Silicon; Speech; Speech recognition; Training; detailed acoustic modeling; speaker-ensemble acoustic model; template-based automatic speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on
Conference_Location :
Kowloon
Print_ISBN :
978-1-4673-2506-6
Electronic_ISBN :
978-1-4673-2505-9
Type :
conf
DOI :
10.1109/ISCSLP.2012.6423532
Filename :
6423532