• DocumentCode
    672349
  • Title

    A propagation approach to modelling the joint distributions of clean and corrupted speech in the Mel-Cepstral domain

  • Author

    Fernandez Astudillo, Ramon

  • Author_Institution
    Spoken Language Syst. Lab., INESC-ID-Lisboa, Lisbon, Portugal
  • fYear
    2013
  • fDate
    8-12 Dec. 2013
  • Firstpage
    180
  • Lastpage
    185
  • Abstract
    This paper presents a closed-form solution relating the joint distributions of corrupted and clean speech in the short-time Fourier transform (STFT) and Mel-frequency cepstral coefficient (MFCC) domains. This enables tighter integration of STFT-domain speech enhancement with feature- and model-compensation techniques for robust automatic speech recognition. The approach directly uses the conventional speech distortion model for STFT speech enhancement, allowing for low-cost, single-pass, causal implementations. Compared to similar uncertainty propagation approaches, it provides the full joint distribution rather than just the posterior distribution, which opens up additional model-compensation possibilities. The method is exemplified by deriving an MMSE-MFCC estimator from the propagated joint distribution. Performance similar to that of STFT uncertainty propagation (STFT-UP) is obtained on the AURORA4 corpus, while also yielding the full joint distribution.
  • Keywords
    Fourier transforms; cepstral analysis; speech enhancement; speech recognition; MFCC domain; Mel-cepstral domain; Mel-frequency cepstral coefficient; STFT speech enhancement; STFT uncertainty propagation; clean speech distribution; closed form solution; corrupted speech distribution; model compensation technique; propagated joint distribution; propagation approach; robust automatic speech recognition; short time Fourier transform; speech distortion model; Computational modeling; Hidden Markov models; Joints; Mel frequency cepstral coefficient; Speech; Speech enhancement; Uncertainty; Modified Imputation; Speech Enhancement; Uncertainty Decoding; Uncertainty Propagation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on
  • Conference_Location
    Olomouc
  • Type

    conf

  • DOI
    10.1109/ASRU.2013.6707726
  • Filename
    6707726