• DocumentCode
    3162937
  • Title
    Unseen noise robust speech recognition using adaptive piecewise linear transformation
  • Author
    Chijiiwa, Keigo ; Suzuki, Masayuki ; Minematsu, Nobuaki ; Hirose, Keikichi
  • Author_Institution
    Univ. of Tokyo, Tokyo, Japan
  • fYear
    2012
  • fDate
    25-30 March 2012
  • Firstpage
    4289
  • Lastpage
    4292
  • Abstract
    SPLICE is a feature-conversion-based speech enhancement method that achieves high performance at a relatively low computational cost. Noisy speech features are first modeled with a GMM, and a conversion function is then obtained for each GMM component. The original SPLICE estimates a clean feature vector as a weighted sum of the converted versions of the input vector. Because the conversion functions are determined from training data alone and then fixed, the original SPLICE becomes less effective in unseen noisy environments. In this paper, we propose a novel method that adapts the conversion functions so that they work well in unseen environments. First, to make the conversion functions adaptive, we characterize them by their supervectors. Then, we apply PCA to the supervectors to reduce the number of parameters to be adapted. By representing the supervectors through their PCA-based basis functions and weights, we obtain an efficient adaptation method for the conversion functions, which we call Eigen-SPLICE hereafter. Evaluation experiments show that Eigen-SPLICE reduces the word error rate by 21.0% relative to the conventional SPLICE and by 24.1% relative to EMS SPLICE on test set B of the AURORA-2 task.
  • Keywords
    Gaussian processes; principal component analysis; speech enhancement; speech recognition; vectors; AURORA-2 task; GMM; Gaussian mixture model; PCA; adaptive piecewise linear transformation; base function; eigen-SPLICE; feature conversion function; noisy speech feature modeling; speech enhancement method; unseen noise robust speech recognition; vector; weighted summation; Environmental management; Indexes; Noise measurement; Speech; Speech recognition; Training data; Vectors; Noise robust; Piecewise linear techniques; Principal component analysis; SPLICE
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
  • Conference_Location
    Kyoto
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4673-0045-2
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2012.6288867
  • Filename
    6288867
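
The pipeline described in the abstract (GMM posteriors, per-component conversion functions, supervectors, and a PCA basis) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a diagonal-covariance GMM and affine conversion functions of the form A_k y + b_k, neither of which the abstract specifies, and every function name below is hypothetical. The abstract also does not say how the PCA weights are estimated for an unseen environment, so the sketch only shows how a supervector is rebuilt from given weights.

```python
import numpy as np


def posteriors(y, means, variances, priors):
    """Posterior p(k | y) under a diagonal-covariance GMM (assumed form)."""
    diff = y - means  # shape (K, D)
    log_p = np.log(priors) - 0.5 * np.sum(
        diff**2 / variances + np.log(2.0 * np.pi * variances), axis=1
    )
    log_p -= log_p.max()  # subtract the max for numerical stability
    p = np.exp(log_p)
    return p / p.sum()


def splice_convert(y, post, A, b):
    """Clean estimate: posterior-weighted sum of per-component conversions.

    A has shape (K, D, D) and b has shape (K, D); the affine form is an
    assumption made for this sketch.
    """
    converted = A @ y + b  # shape (K, D): one converted vector per component
    return post @ converted


def to_supervector(A, b):
    """Stack all conversion parameters of one environment into one vector."""
    return np.concatenate([A.ravel(), b.ravel()])


def from_supervector(s, K, D):
    """Inverse of to_supervector."""
    A = s[: K * D * D].reshape(K, D, D)
    b = s[K * D * D:].reshape(K, D)
    return A, b


def pca_basis(supervectors, n_components):
    """Mean and leading principal directions of the training supervectors."""
    mean = supervectors.mean(axis=0)
    _, _, vt = np.linalg.svd(supervectors - mean, full_matrices=False)
    return mean, vt[:n_components]


def adapted_supervector(mean, basis, weights):
    """Supervector for a new environment: mean plus a weighted sum of bases."""
    return mean + weights @ basis


# Toy usage with random stand-ins for the supervectors of 5 training
# environments (illustrative numbers only, not data from the paper).
rng = np.random.default_rng(0)
K, D = 3, 4
train = rng.normal(size=(5, K * D * D + K * D))
mean, basis = pca_basis(train, n_components=2)

# Only 2 weights are adapted instead of all K*D*D + K*D parameters.
weights = np.array([0.5, -0.2])
A, b = from_supervector(adapted_supervector(mean, basis, weights), K, D)

y = rng.normal(size=D)
post = posteriors(y, rng.normal(size=(K, D)), np.ones((K, D)), np.full(K, 1.0 / K))
x_hat = splice_convert(y, post, A, b)
```

In this sketch the number of adapted parameters drops from K*D*D + K*D to the number of retained principal components, which corresponds to the parameter reduction the abstract attributes to PCA.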