• DocumentCode
    2979152
  • Title
    Joint distributional modeling with cross-correlation based features
  • Author
    Bilmes, Jeff A.
  • Author_Institution
    Int. Comput. Sci. Inst., Berkeley, CA, USA
  • fYear
    1997
  • fDate
    14-17 Dec 1997
  • Firstpage
    148
  • Lastpage
    155
  • Abstract
    In maximum-likelihood-based speech recognition systems, it is important to accurately estimate the joint distribution of feature vectors given a particular acoustic model. We propose that by modeling the joint distribution of time-localized feature vectors and statistics relating those time-localized feature vectors to the relevant acoustic context, we can estimate information contained in the feature vector joint distribution without the accompanying theoretical or computational difficulties. We introduce the modcrossgram (MCG), a computational way of estimating short-time spectro-temporal correlation-based statistics that are informative about the feature vector joint distribution. Using the standard hybrid ANN/HMM architecture, we compare an MCG-based speech recognition system with a more traditional one on an isolated-word speech database. We show that, in the presence of noise, the MCG-based system achieves a significant reduction in word error rate over the standard system.
  • Keywords
    hidden Markov models; maximum likelihood detection; neural nets; speech recognition; statistical analysis; MCG based speech recognition system; acoustic context; acoustic model; cross correlation based features; feature vector joint distribution; feature vectors; isolated word speech database; joint distributional modeling; maximum likelihood based speech recognition systems; modcrossgram; short time spectro temporal correlation based statistics; standard hybrid ANN/HMM architecture; time localized feature vectors; word error rate; Computer architecture; Context modeling; Distributed computing; Error analysis; Hidden Markov models; Maximum likelihood estimation; Noise reduction; Spatial databases; Speech recognition; Statistical distributions
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding, 1997. Proceedings., 1997 IEEE Workshop on
  • Conference_Location
    Santa Barbara, CA
  • Print_ISBN
    0-7803-3698-4
  • Type
    conf
  • DOI
    10.1109/ASRU.1997.658999
  • Filename
    658999