• DocumentCode
    989802
  • Title
    Environmental Sniffing: Noise Knowledge Estimation for Robust Speech Systems

  • Author
    Akbacak, Murat; Hansen, John H. L.

  • Author_Institution
    Erik Jonsson Sch. of Eng. & Comput. Sci., Texas Univ., Richardson, TX
  • Volume
    15
  • Issue
    2
  • fYear
    2007
  • Firstpage
    465
  • Lastpage
    477
  • Abstract
    Automatic speech recognition systems work reasonably well under clean conditions but become fragile in practical applications involving real-world environments. To date, most approaches dealing with environmental noise in speech systems are based on assumptions concerning the noise, or differences in collecting and training on a specific noise condition, rather than exploring the nature of the noise. As such, speech recognition, speaker ID, or coding systems are typically retrained when new acoustic conditions are to be encountered. In this paper, we propose a new framework entitled Environmental Sniffing to detect, classify, and track acoustic environmental conditions. The first goal of the framework is to seek out detailed information about the environmental characteristics instead of just detecting environmental changes. The second goal is to organize this knowledge in an effective manner to allow smart decisions to direct subsequent speech processing systems. Our current framework uses a number of speech processing modules including a hybrid algorithm with T2-BIC segmentation, Gaussian mixture model/hidden Markov model (GMM/HMM)-based classification and noise language modeling to achieve effective noise knowledge estimation. We define a new information criterion that incorporates the impact of noise into Environmental Sniffing performance. We use an in-vehicle speech and noise environment as a test platform for our evaluations and investigate the integration of Environmental Sniffing for automatic speech recognition (ASR) in this environment. Noise sniffing experiments show that our proposed hybrid algorithm achieves a classification error rate of 25.51%, outperforming our baseline system by 7.08%. The sniffing framework is compared to a ROVER solution for automatic speech recognition (ASR) using different noise conditioned recognizers in terms of word error rate (WER) and CPU usage. 
    Results show that the model matching scheme using the knowledge extracted from the audio stream by Environmental Sniffing achieves better performance than a ROVER solution in both accuracy and computation. A relative 11.1% WER improvement is achieved with a relative 75% reduction in CPU resources.
  • Keywords
    Gaussian processes; hidden Markov models; speech processing; speech recognition; Gaussian mixture model; T2-BIC segmentation; acoustic environmental conditions; audio stream; automatic speech recognition systems; environmental sniffing; hidden Markov model; hybrid algorithm; in-vehicle speech; model matching scheme; noise knowledge estimation; noise language modeling; robust speech systems; speech processing systems; word error rate; Acoustic noise; Acoustic signal detection; Automatic speech recognition; Error analysis; Gaussian noise; Hidden Markov models; Noise robustness; Speech enhancement; Speech processing; Working environment noise; In-vehicle systems; ROVER; T2-BIC; noise classification; noise modeling; noise tracking; robust speech recognition; speech enhancement
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2006.881694
  • Filename
    4067018