• DocumentCode
    257803
  • Title
    Sequence discriminative training for low-rank deep neural networks
  • Author
    Tachioka, Yuuki ; Watanabe, Shinji ; Le Roux, Jonathan ; Hershey, John R.
  • Author_Institution
    Inf. Technol. R&D Center, Mitsubishi Electr. Corp., Kanagawa, Japan
  • fYear
    2014
  • fDate
    3-5 Dec. 2014
  • Firstpage
    572
  • Lastpage
    576
  • Abstract
    Deep neural networks (DNNs) have proven very successful for automatic speech recognition, but the number of parameters tends to be large, leading to high computational cost. To reduce the size of a DNN model, low-rank approximations of weight matrices, computed using singular value decomposition (SVD), have previously been applied. Previous studies focused only on clean speech, whereas the additional variability in noisy speech could make model reduction difficult. Thus, we investigate the effectiveness of this SVD method on noisy reverberated speech. Furthermore, we combine the low-rank approximation with sequence discriminative training, which further improves the performance of the DNN, even though the original DNN was constructed using a discriminative criterion. We also investigate the effect of the order in which the low-rank approximation and sequence discriminative training are applied. Our experiments show that low-rank approximation is effective for noisy speech, and that the most effective combination of discriminative training with model reduction is to apply the low-rank approximation to the base model first and then perform discriminative training on the low-rank model. This low-rank discriminatively trained model outperformed the full discriminatively trained model.
  • Keywords
    approximation theory; matrix algebra; neural nets; reverberation; singular value decomposition; speech recognition; DNN model; SVD method; automatic speech recognition; clean speech; computational cost; discriminative criterion; low-rank approximation; low-rank deep neural network; model reduction; noisy reverberated speech; noisy speech; sequence discriminative training; weight matrices; Acoustics; Approximation methods; Hidden Markov models; Neural networks; Speech; Training; deep neural networks; discriminative training
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal and Information Processing (GlobalSIP), 2014 IEEE Global Conference on
  • Conference_Location
    Atlanta, GA
  • Type
    conf
  • DOI
    10.1109/GlobalSIP.2014.7032182
  • Filename
    7032182