DocumentCode
1467916
Title
Direct training of subspace distribution clustering hidden Markov model
Author
Mak, Brian Kan-Wing ; Bocchieri, Enrico
Author_Institution
Dept. of Comput. Sci., Hong Kong Univ. of Sci. & Technol., Clear Water Bay, China
Volume
9
Issue
4
fYear
2001
fDate
5/1/2001 12:00:00 AM
Firstpage
378
Lastpage
387
Abstract
It generally takes a long time and requires a large amount of speech data to train hidden Markov models for a speech recognition task with a reasonably large vocabulary. Previously, we proposed a compact acoustic model called “subspace distribution clustering hidden Markov model” (SDCHMM) with the aim of reducing the training effort. SDCHMMs are derived by tying continuous density hidden Markov models (CDHMMs) at a finer subphonetic level, namely the subspace distributions. Experiments on the Airline Travel Information System (ATIS) task show that SDCHMMs with significantly fewer model parameters, by one to two orders of magnitude, can be converted from CDHMMs with no loss in word accuracy. With such compact acoustic models, one should be able to train SDCHMMs directly from significantly less speech data (without intermediate CDHMMs). We devise a direct SDCHMM training algorithm, assuming a priori knowledge of the subspace distribution tying structure. On the ATIS task, it is found that both a context-independent and a context-dependent speaker-independent 20-stream SDCHMM system trained with 8 min of speech perform as well as their corresponding CDHMM systems trained with 105 min and 36 h of speech, respectively.
Keywords
hidden Markov models; speech recognition; ATIS; Airline Travel Information System; CDHMM; SDCHMM; compact acoustic model; context-dependent speaker-independent system; context-independent system; continuous density hidden Markov models; direct SDCHMM training algorithm; direct training; hidden Markov model; model parameters; speech data; subphonetic level; subspace distribution clustering HMM; subspace distributions; word accuracy; Associate members; Automatic speech recognition; Computer science; Hidden Markov models; Information systems; Laboratories; Parameter estimation; Speech recognition; Training data; Vocabulary
fLanguage
English
Journal_Title
Speech and Audio Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1063-6676
Type
jour
DOI
10.1109/89.917683
Filename
917683