• DocumentCode
    1467916
  • Title

    Direct training of subspace distribution clustering hidden Markov model

  • Author

    Mak, Brian Kan-Wing ; Bocchieri, Enrico

  • Author_Institution
    Dept. of Comput. Sci., Hong Kong Univ. of Sci. & Technol., Clear Water Bay, China
  • Volume
    9
  • Issue
    4
  • fYear
    2001
  • fDate
    5/1/2001 12:00:00 AM
  • Firstpage
    378
  • Lastpage
    387
  • Abstract
    It generally takes a long time and requires a large amount of speech data to train hidden Markov models for a speech recognition task with a reasonably large vocabulary. Previously, we proposed a compact acoustic model called the “subspace distribution clustering hidden Markov model” (SDCHMM) with the aim of saving some of the training effort. SDCHMMs are derived by tying continuous density hidden Markov models (CDHMMs) at a finer subphonetic level, namely the subspace distributions. Experiments on the Airline Travel Information System (ATIS) task show that SDCHMMs with significantly fewer model parameters (by one to two orders of magnitude) can be converted from CDHMMs with no loss in word accuracy. With such compact acoustic models, one should be able to train SDCHMMs directly from significantly less speech data (without intermediate CDHMMs). We devise a direct SDCHMM training algorithm, assuming a priori knowledge of the subspace distribution tying structure. On the ATIS task, it is found that both a context-independent and a context-dependent speaker-independent 20-stream SDCHMM system trained with 8 min of speech perform as well as their corresponding CDHMM systems trained with 105 min and 36 h of speech, respectively.
  • Keywords
    hidden Markov models; speech recognition; ATIS; Airline Travel Information System; CDHMM; SDCHMM; compact acoustic model; context-dependent speaker-independent system; context-independent system; continuous density hidden Markov models; direct SDCHMM training algorithm; direct training; hidden Markov model; model parameters; speech data; subphonetic level; subspace distribution clustering HMM; subspace distributions; word accuracy; Associate members; Automatic speech recognition; Computer science; Hidden Markov models; Information systems; Laboratories; Parameter estimation; Speech recognition; Training data; Vocabulary
  • fLanguage
    English
  • Journal_Title
    Speech and Audio Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-6676
  • Type

    jour

  • DOI
    10.1109/89.917683
  • Filename
    917683