• DocumentCode
    1576755
  • Title
    Language models learning for domain-specific natural language user interaction
  • Author
    Bai, Shuanhu ; Huang, Chien-Lin ; Tan, Yeow-Kee ; Ma, Bin
  • Author_Institution
    Social Robot Group, Inst. for Infocomm Res., Singapore, Singapore
  • fYear
    2009
  • Firstpage
    2480
  • Lastpage
    2485
  • Abstract
    Natural language interfaces are an important research topic in natural language processing (NLP), and natural language may be the most natural and efficient way to interact with a robot. To build speech-enabled natural language interfaces for robots, our research goal is to study the problems in this area and develop technologies that can potentially improve human-robot interaction. In particular, we present a learning method for building domain-specific language models (LMs) for natural language user interfaces. The method uses a small amount of domain-specific data as seeds to tap the domain-specific resources residing in a larger amount of general-domain data, with the help of topic modeling. The proposed algorithm first performs topic decomposition (TD) on the combined dataset of domain-specific and general-domain data using probabilistic latent semantic analysis (PLSA). It then derives weighted domain-specific word n-gram counts using the mixture modeling scheme of PLSA. Finally, it uses a traditional n-gram modeling approach to construct domain-specific LMs from these weighted n-gram counts. Experimental results show that this approach can outperform both state-of-the-art methods and a traditional supervised learning method. In addition, the semi-supervised learning method can achieve better performance even with a very small amount of domain-specific data.
  • Keywords
    human-robot interaction; learning (artificial intelligence); natural language interfaces; natural language processing; domain-specific natural language processing; general-domain data; human-robot interaction; language models learning; n-gram modeling approach; natural language user interfaces; probabilistic latent semantic analysis; semisupervised learning method; topic decomposition; user interaction; weighted domain-specific word n-gram count; Algorithm design and analysis; Domain specific languages; Human robot interaction; Learning systems; Natural language processing; Natural languages; Performance analysis; Speech; Supervised learning; User interfaces;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Robotics and Biomimetics (ROBIO), 2009 IEEE International Conference on
  • Conference_Location
    Guilin
  • Print_ISBN
    978-1-4244-4774-9
  • Electronic_ISBN
    978-1-4244-4775-6
  • Type
    conf
  • DOI
    10.1109/ROBIO.2009.5420442
  • Filename
    5420442