  • DocumentCode
    1066747
  • Title
    Static and Dynamic Variance Compensation for Recognition of Reverberant Speech With Dereverberation Preprocessing
  • Author
    Delcroix, Marc; Nakatani, Tomohiro; Watanabe, Shinji
  • Author_Institution
    NTT Commun. Sci. Labs., NTT Corp., Kyoto
  • Volume
    17
  • Issue
    2
  • fYear
    2009
  • Firstpage
    324
  • Lastpage
    334
  • Abstract
    The performance of automatic speech recognition is severely degraded in the presence of noise or reverberation. Much research has been undertaken on noise robustness. In contrast, the recognition of reverberant speech has received far less attention and remains very challenging. In this paper, we use a dereverberation method to reduce reverberation prior to recognition. Such a preprocessor may remove most reverberation effects. However, it often introduces distortion, causing a dynamic mismatch between the speech features and the acoustic model used for recognition. Model adaptation could be used to reduce this mismatch. However, conventional model adaptation techniques assume a static mismatch and may therefore cope poorly with the dynamic mismatch arising from dereverberation. This paper proposes a novel adaptation scheme that is capable of managing both static and dynamic mismatches. We introduce a parametric model for variance adaptation that includes static and dynamic components in order to realize an appropriate interconnection between dereverberation and a speech recognizer. The model parameters are optimized using adaptive training implemented with the expectation maximization algorithm. In an experiment on reverberant speech with a reverberation time of 0.5 s, the proposed method achieved an 80% relative reduction in error rate compared with the recognition of dereverberated speech (word error rate of 31%); the final error rate of 5.4% was obtained by combining the proposed variance compensation with MLLR adaptation.
  • Keywords
    expectation-maximisation algorithm; reverberation; speech recognition; acoustic model; dereverberation preprocessing; dynamic variance compensation; expectation maximization algorithm; reverberant speech recognition; static variance compensation; time 0.5 s; Acoustic distortion; Adaptation model; Automatic speech recognition; Degradation; Error analysis; Management training; Noise robustness; Parametric statistics; Reverberation; Speech recognition; Dereverberation; model adaptation; robust automatic speech recognition (ASR); variance compensation;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2008.2010214
  • Filename
    4749470