• DocumentCode
    179595
  • Title
    Joint acoustic modeling of triphones and trigraphemes by multi-task learning deep neural networks for low-resource speech recognition
  • Author
    Dongpeng Chen; Brian Mak; Cheung-Chi Leung; Sunil Sivadas
  • Author_Institution
    Dept. of Computer Science & Engineering, Hong Kong University of Science & Technology, Hong Kong, China
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    5592
  • Lastpage
    5596
  • Abstract
    It is well known in machine learning that multitask learning (MTL) can improve the generalization performance of individual learning tasks when the tasks being trained in parallel are related, especially when the amount of training data is relatively small. In this paper, we investigate the estimation of triphone acoustic models in parallel with the estimation of trigrapheme acoustic models under the MTL framework using deep neural networks (DNNs). As triphone modeling and trigrapheme modeling are highly related learning tasks, a better shared internal representation (the hidden layers) can be learned to improve their generalization performance. Experimental evaluation on three low-resource South African languages shows that triphone DNNs trained by the MTL approach perform significantly better, by ~3-13%, than triphone DNNs trained by the single-task learning (STL) approach. The MTL-DNN triphone models also outperform the ROVER result that combines a triphone STL-DNN and a trigrapheme STL-DNN.
  • Keywords
    learning (artificial intelligence); natural language processing; neural nets; speech recognition; MTL-DNN triphone models; South African languages; deep neural networks; joint acoustic modeling; machine learning; multitask learning; single-task learning; trigrapheme STL-DNN; trigrapheme acoustic models; triphone STL-DNN; triphone acoustic models; Acoustics; Joints; Neural networks; Speech; Speech processing; Training; Training data; trigrapheme modeling; triphone modeling
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/ICASSP.2014.6854673
  • Filename
    6854673