  • DocumentCode
    178062
  • Title
    Generalization of temporal filter and linear transformation for robust speech recognition
  • Author
    Duc Hoang Ha Nguyen ; Xiong Xiao ; Eng Siong Chng ; Haizhou Li
  • Author_Institution
    Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore, Singapore
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    1730
  • Lastpage
    1734
  • Abstract
    Temporal filtering of feature trajectories and linear transformation of feature vectors are two effective ways to compensate speech features for robust speech recognition in noisy and reverberant environments. In previous studies, the two methods were usually applied in sequence, so the interaction between them was not optimized. In this paper, we propose a generalized transform that integrates temporal filtering and linear transformation into a single process. The new transform parameters are optimized to minimize an approximated Kullback-Leibler divergence between the distribution of the compensated features and the distribution represented by a clean reference model. The proposed method is evaluated on the Aurora-5 clean-condition training task. The experiments show that the generalized transform significantly outperforms the simple cascade of temporal filtering and linear transformation. For example, in the speaker-based feature adaptation scheme, word accuracy improves from 81.55% (cascade) to 83.99% (generalized) in the office environment and from 72.09% (cascade) to 76.04% (generalized) in the living room environment.
  • Keywords
    filtering theory; speech recognition; transforms; Aurora-5 clean condition training task; approximated Kullback-Leibler divergence; clean reference model; compensated feature distribution; feature trajectory; feature vectors; generalized transform; linear transformation; noisy environments; reverberant environments; robust speech recognition; speaker based feature adaptation scheme; temporal filter generalization; Acoustics; Robustness; Speech; Speech processing; Speech recognition; Transforms; Vectors; Kullback-Leibler divergence; Robust speech recognition; linear transformation; reverberant speech recognition; temporal filter;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/ICASSP.2014.6853894
  • Filename
    6853894
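
The abstract contrasts a cascade (temporal filter, then linear transformation) with a single generalized transform. The sketch below illustrates one plausible reading of that contrast; it is not taken from the paper. It assumes the cascade is y_t = A (sum_k w_k x_{t+k}) + b with filter taps w_k shared across feature dimensions, and that the generalized transform has the form y_t = sum_k A_k x_{t+k} + b with one matrix per context offset. The function names, the clamped edge handling, and the toy data are all hypothetical. The KL-divergence-based parameter estimation described in the abstract is not shown.

```python
# Illustrative sketch only. The two formulations below are assumptions made
# for illustration, not the paper's exact definitions.

def cascade(frames, taps, A, b):
    """Filter each feature trajectory with shared taps, then apply y = A x + b."""
    half = len(taps) // 2
    T, D = len(frames), len(frames[0])
    out = []
    for t in range(T):
        # Temporal filtering: weighted sum of neighbouring frames (edges clamped).
        filtered = [0.0] * D
        for k, w in enumerate(taps):
            src = frames[min(max(t + k - half, 0), T - 1)]
            for d in range(D):
                filtered[d] += w * src[d]
        # Linear transformation of the filtered feature vector.
        out.append([sum(A[i][j] * filtered[j] for j in range(D)) + b[i]
                    for i in range(D)])
    return out


def generalized(frames, mats, b):
    """Assumed generalized form: y_t = sum_k A_k x_{t+k} + b."""
    half = len(mats) // 2
    T, D = len(frames), len(frames[0])
    out = []
    for t in range(T):
        y = list(b)
        for k, Ak in enumerate(mats):
            src = frames[min(max(t + k - half, 0), T - 1)]
            for i in range(D):
                y[i] += sum(Ak[i][j] * src[j] for j in range(D))
        out.append(y)
    return out


# Toy data: 4 frames of 2-dimensional features (hypothetical values).
frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
taps = [0.25, 0.5, 0.25]
A = [[1.0, 0.5], [0.0, 2.0]]
b = [0.1, -0.2]

# Under these assumptions the cascade is the special case A_k = w_k * A.
mats = [[[w * a for a in row] for a_row in [row]][0] for w in taps for row in [None]] if False else \
       [[[w * a for a in row] for row in A] for w in taps]

y_cascade = cascade(frames, taps, A, b)
y_general = generalized(frames, mats, b)
max_diff = max(abs(p - q) for u, v in zip(y_cascade, y_general) for p, q in zip(u, v))
```

Under these assumptions the generalized form contains the cascade as the constrained case A_k = w_k A, which is why the two outputs agree here. Letting each A_k vary freely gives the extra degrees of freedom that joint optimization could exploit.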