• DocumentCode
    672319
  • Title
    K-component recurrent neural network language models using curriculum learning

  • Author
    Shi, Yangyang ; Larson, Martha ; Jonker, Catholijn M.

  • Author_Institution
    Intell. Syst. Dept., Delft Univ. of Technol., Delft, Netherlands
  • fYear
    2013
  • fDate
    8-12 Dec. 2013
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
Conventional n-gram language models are known for their limited ability to capture long-distance dependencies and their brittleness with respect to within-domain variations. In this paper, we propose a k-component recurrent neural network language model using curriculum learning (CL-KRNNLM) to address within-domain variations. Based on a Dutch-language corpus, we investigate three methods of curriculum learning that exploit dedicated component models for specific sub-domains. Under an oracle condition in which context information is known during testing, we experimentally test three hypotheses. The first is that domain-dedicated models perform better than general models on their specific domains. The second is that curriculum learning can be used to train recurrent neural network language models (RNNLMs) from general patterns to specific patterns. The third is that curriculum learning, used as an implicit weighting method to adjust the relative contributions of general and specific patterns, outperforms conventional linear interpolation. When context information is unknown during testing, the CL-KRNNLM still achieves a 13% relative improvement in word prediction accuracy over the conventional RNNLM. Finally, the CL-KRNNLM is tested in an additional N-best rescoring experiment on a standard data set, in which the context domains are created by clustering the training data using Latent Dirichlet Allocation and k-means clustering.
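    The abstract's final experiment builds context domains by clustering the training data with Latent Dirichlet Allocation followed by k-means, and the curriculum then trains from general to specific patterns. The following is a minimal sketch of that data-preparation and ordering step, assuming scikit-learn and a toy corpus; the sentences, the choice of K, and the train() placeholder are illustrative assumptions, not the authors' actual pipeline.

    ```python
    # Sketch: LDA topic posteriors -> k-means domains -> general-to-specific
    # curriculum ordering. Not the paper's implementation; an illustration only.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.cluster import KMeans

    sentences = [
        "the market closed higher on strong earnings",
        "the team scored twice in the second half",
        "parliament passed the budget after a long debate",
        "investors sold shares as rates rose",
        "the striker signed a new contract with the club",
        "the minister defended the policy in parliament",
    ]

    K = 3  # number of component domains (the "k" in k-component); assumed value

    # Bag-of-words counts -> per-sentence topic posteriors under LDA.
    counts = CountVectorizer().fit_transform(sentences)
    topic_posteriors = LatentDirichletAllocation(
        n_components=K, random_state=0
    ).fit_transform(counts)

    # k-means over the topic posteriors assigns each sentence to a domain.
    domains = KMeans(n_clusters=K, random_state=0, n_init=10).fit_predict(
        topic_posteriors
    )

    def train(model_state, batch):
        """Placeholder for one RNNLM training pass; a stand-in, not the paper's model."""
        return model_state

    # Curriculum: one pass over the full (general) corpus first, then a pass
    # over each domain-specific subset, moving from general to specific patterns.
    state = None
    state = train(state, sentences)  # general stage
    for d in range(K):
        subset = [s for s, dom in zip(sentences, domains) if dom == d]
        state = train(state, subset)  # domain-specific stage
    ```

    For comparison, the linear-interpolation baseline named in the third hypothesis would combine component probabilities explicitly, e.g. P(w|h) = λ·P_general(w|h) + (1−λ)·P_specific(w|h) with λ tuned on held-out data, whereas curriculum learning folds this weighting implicitly into the training order.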
  • Keywords
learning (artificial intelligence); natural language processing; pattern clustering; recurrent neural nets; CL-KRNNLM; Dutch-language corpus; N-best rescoring; curriculum learning; dedicated component models; domain-dedicated models; k-component recurrent neural network language models; k-means clustering; latent Dirichlet allocation; linear interpolation; long-distance dependencies; n-gram language models; oracle situation; within-domain variations; Accuracy; Context; Data models; Interpolation; Recurrent neural networks; Training; Training data; Curriculum Learning; Language Models; Latent Dirichlet Allocation; Recurrent Neural Networks; Socio-situational setting; Topics
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on
  • Conference_Location
    Olomouc, Czech Republic
  • Type
    conf
  • DOI
    10.1109/ASRU.2013.6707696
  • Filename
    6707696