• DocumentCode
    542219
  • Title

    Enhanced posteriors bias prediction for robust multi-stream ASR combining voicing and estimate reliabilities

  • Author

    Glotin, Hervé

  • Author_Institution
    ERSS - CNRS, 5 all. Machado; Toulouse Cedex 1 - France
  • Volume
    1
  • fYear
    2002
  • fDate
    13-17 May 2002
  • Abstract
    We discuss the fusion of speech and phoneme estimate reliabilities in a multi-stream Automatic Speech Recognizer (ASR) to improve ASR robustness. The Full Combination (FC) approach decomposes the full-band posterior probability for each phoneme into a reliability-weighted sum of posteriors from combinations of streams. Previous studies show that the weighting factors in FC should take into account not only the reliability of the speech signal, but also the intrinsic efficiency of the subband experts. To control these two variables for each combination of posteriors, we derive a new model called “Posteriors Bias Prediction” (PBP), inspired by the Shannon Correction system. We show that FC is a specific type of PBP, and that PBP allows the integration of stream reliability based on the voicing level R (correlated with the Signal-to-Noise Ratio, SNR) and the phoneme's class. Tests on free-format telephone digits (Numbers95) under various noise conditions and SNR levels demonstrate that PBP outperforms the FC, J-RASTA, and Spectral Subtraction methods.
  • Keywords
    Adaptation model; Hidden Markov models; Robustness; Signal to noise ratio; Speech processing; Speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
  • Conference_Location
    Orlando, FL, USA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7402-9
  • Type

    conf

  • DOI
    10.1109/ICASSP.2002.5743717
  • Filename
    5743717