• DocumentCode
    454564
  • Title
    Maximum Conditional Mutual Information Weighted Scoring for Speech Recognition
  • Author
    Omar, Mohamed Kamal; Ramaswamy, Ganesh N.
  • Author_Institution
    IBM Thomas J. Watson Research Center, Yorktown Heights, NY
  • Volume
    1
  • fYear
    2006
  • fDate
    14-19 May 2006
  • Abstract
    This paper describes a novel approach for extending the prototype Gaussian mixture model used to represent different classes in many recognition or classification systems, and its application to large vocabulary automatic speech recognition (ASR). The extension is achieved by estimating weighting vectors for the log likelihood values contributed by different elements of the feature vector. The weighting vectors are chosen to maximize an estimate of the conditional mutual information between the log likelihood score and a binary random variable that indicates whether the log likelihood was computed using the model of the correct label. The paper shows that, under some assumptions on the conditional probability density function (PDF) of the log likelihood score given this random variable, maximizing the differential entropy of a normalized log likelihood score is an equivalent criterion. The approach allows different features in the acoustic feature vector to be emphasized for different hidden Markov model (HMM) states. The approach is applied to the RT04 Arabic broadcast news speech recognition task, where it yields a 3% relative improvement in word error rate (WER) over the baseline system.
  • Keywords
    Gaussian processes; hidden Markov models; probability; speech recognition; Gaussian mixture model; HMM; RT04 Arabic broadcast news; binary random variable; differential entropy; hidden Markov model; mutual information weighted scoring; normalized log likelihood score; probability density function; vocabulary automatic speech recognition; word error rate; Automatic speech recognition; Broadcasting; Entropy; Hidden Markov models; Mutual information; Probability density function; Prototypes; Random variables; Speech recognition; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
  • Conference_Location
    Toulouse
  • ISSN
    1520-6149
  • Print_ISBN
    1-4244-0469-X
  • Type
    conf
  • DOI
    10.1109/ICASSP.2006.1660011
  • Filename
    1660011
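
The per-feature weighting described in the abstract can be illustrated with a minimal sketch. It assumes a single diagonal-covariance Gaussian per class instead of the paper's full mixture model, so that the log likelihood splits into one term per feature dimension. The function names and the hand-picked weights are hypothetical; the paper estimates the weights by maximizing conditional mutual information, which is not shown here.

```python
import math


def per_dim_log_likelihood(x, mean, var):
    """Log likelihood contributed by each feature dimension.

    Assumes a diagonal-covariance Gaussian (an illustrative simplification),
    so the total log likelihood is the sum of these per-dimension terms.
    """
    return [
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    ]


def weighted_score(x, mean, var, weights):
    """Weighted sum of the per-dimension log likelihood terms.

    With all weights equal to 1 this reduces to the ordinary log likelihood.
    """
    terms = per_dim_log_likelihood(x, mean, var)
    return sum(w * t for w, t in zip(weights, terms))


if __name__ == "__main__":
    x = [0.3, -1.2]
    mean = [0.0, 0.0]
    var = [1.0, 2.0]
    # Hypothetical weights that emphasize the first feature for this state.
    print(weighted_score(x, mean, var, [1.5, 0.5]))
```

In a recognizer, each HMM state would carry its own weight vector, which is how the approach emphasizes different features for different states.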