• DocumentCode
    3245335
  • Title

    Nonlinear spectral transformations for robust speech recognition

  • Author

    Ikbal, Shajith ; Hermansky, Hynek ; Bourlard, Hervé

  • Author_Institution
    IDIAP, Martigny, Switzerland
  • fYear
    2003
  • fDate
    30 Nov.-3 Dec. 2003
  • Firstpage
    393
  • Lastpage
    398
  • Abstract
    Recently, a nonlinear transformation of the autocorrelation coefficients, named phase autocorrelation (PAC) coefficients, has been considered for feature extraction. PAC-based features show improved robustness to additive noise as a result of two operations performed during the computation of PAC, namely energy normalization and inverse cosine transformation. Although these two operations improve robustness for noisy speech, they lead to some degradation in recognition performance for clean speech. In this paper, we try to alleviate this problem, first by introducing the energy information back into the PAC-based features, and second by studying alternatives to the inverse cosine function. Simply appending the frame energy as an additional coefficient to the PAC features results in a noticeable improvement in performance for clean speech. The study of alternatives to the inverse cosine transformation leads to the conclusion that a linear transformation is best for clean speech, while nonlinear functions help to improve robustness in noise.
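    The two operations named in the abstract, energy normalization followed by an inverse cosine, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `pac_coefficients`, the use of a circular autocorrelation, and the choice of log energy for the appended coefficient are assumptions made only for illustration.

```python
import numpy as np

def pac_coefficients(frame, append_energy=False):
    """Illustrative phase autocorrelation (PAC) of a single speech frame.

    Assumptions (not taken from the record above): circular autocorrelation,
    and log energy as the optional extra coefficient.
    """
    x = np.asarray(frame, dtype=float)
    n = len(x)
    # Circular autocorrelation: dot product of the frame with rotated copies.
    r = np.array([np.dot(x, np.roll(x, k)) for k in range(n)])
    energy = r[0]  # squared norm of the frame
    if energy == 0.0:
        raise ValueError("frame has zero energy")
    # Energy normalization: cosine of the angle between the frame and each
    # rotated copy (clipped to guard against rounding just outside [-1, 1]).
    cos_theta = np.clip(r / energy, -1.0, 1.0)
    # Inverse cosine transformation: the angle itself is the PAC coefficient.
    pac = np.arccos(cos_theta)
    if append_energy:
        # Reintroduce the energy discarded by the normalization (assumed: log).
        pac = np.append(pac, np.log(energy))
    return pac
```

    Because of the normalization, scaling the frame leaves the PAC values unchanged; the optional appended coefficient is the only place where frame energy re-enters the feature vector in this sketch.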
  • Keywords
    correlation methods; feature extraction; nonlinear functions; speech recognition; PAC based features energy information; additive noise; clean speech; energy normalization; feature extraction; inverse cosine transformation; linear transformation; noisy speech; nonlinear spectral transformations; phase autocorrelation coefficients; robust speech recognition; Additive noise; Autocorrelation; Degradation; Feature extraction; Noise level; Noise robustness; Phase measurement; Speech analysis; Speech enhancement; Speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
  • Print_ISBN
    0-7803-7980-2
  • Type

    conf

  • DOI
    10.1109/ASRU.2003.1318473
  • Filename
    1318473