• DocumentCode
    2768583
  • Title
    Predictive linear transforms for noise robust speech recognition
  • Author
    Gales, M.J.F.; van Dalen, R.C.
  • Author_Institution
    Cambridge Univ., Cambridge
  • fYear
    2007
  • fDate
    9-13 Dec. 2007
  • Firstpage
    59
  • Lastpage
    64
  • Abstract
    It is well known that the addition of background noise alters the correlations between the elements of, for example, the MFCC feature vector. However, standard model-based compensation techniques do not modify the feature space in which the diagonal covariance matrix Gaussian mixture models are estimated. One solution to this problem, which yields good performance, is joint uncertainty decoding (JUD) with full transforms. Unfortunately, this results in a high computational cost during decoding. This paper contrasts two approaches to approximating full JUD while lowering the computational cost. Both use predictive linear transforms to modify the feature space: adaptation-based linear transforms, where the model parameters are restricted to be the same as the original clean system; and precision matrix modelling approaches, in particular semi-tied covariance matrices. These predictive transforms are estimated using statistics derived from the full JUD transforms rather than noisy data. The schemes are evaluated on AURORA 2 and a noise-corrupted Resource Management task.
  • Keywords
    Gaussian processes; covariance matrices; decoding; estimation theory; noise; speech coding; speech recognition; transforms; AURORA 2; Gaussian mixture model; adaptation-based linear transform; background noise; diagonal covariance matrix; joint uncertainty decoding; model-based compensation technique; noise-corrupted resource management task; predictive linear transform; statistical estimation; Background noise; Computational efficiency; Covariance matrix; Decoding; Mel frequency cepstral coefficient; Noise robustness; Predictive models; Speech recognition; Uncertainty; Vectors; Noise robust speech recognition; precision matrix modelling
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on
  • Conference_Location
    Kyoto
  • Print_ISBN
    978-1-4244-1746-9
  • Electronic_ISBN
    978-1-4244-1746-9
  • Type
    conf
  • DOI
    10.1109/ASRU.2007.4430084
  • Filename
    4430084