• DocumentCode
    3585012
  • Title

    Training a statistical surface realiser from automatic slot labelling

  • Author

    Cuayahuitl, Heriberto ; Dethlefs, Nina ; Hastie, Helen ; Liu, Xingkun

  • Author_Institution
    Sch. of Math. & Comput. Sci., Heriot-Watt Univ., Edinburgh, UK
  • fYear
    2014
  • Firstpage
    112
  • Lastpage
    117
  • Abstract
    Training a statistical surface realiser typically relies on labelled training data or parallel data sets, such as corpora of paraphrases. The procedure for obtaining such data for new domains is not only time-consuming, but it also restricts the incorporation of new semantic slots during an interaction, i.e. using an online learning scenario for automatically extended domains. Here, we present an alternative approach to statistical surface realisation from unlabelled data through automatic semantic slot labelling. The essence of our algorithm is to cluster clauses based on a similarity function that combines lexical and semantic information. Annotations need to be reliable enough to be utilised within a spoken dialogue system. We compare different similarity functions and evaluate our surface realiser (trained from unlabelled data) in a human rating study. Results confirm that a surface realiser trained from automatic slot labels can lead to outputs of comparable quality to outputs trained from human-labelled inputs.
  • Keywords
    data handling; interactive systems; learning (artificial intelligence); pattern clustering; statistical analysis; automatic semantic slot labelling; clause clustering; dialogue system; lexical information; online learning scenario; semantic information; similarity function; statistical surface realiser; unlabelled data; Accuracy; Clustering algorithms; Labeling; Measurement; Semantics; Supervised learning; Training; dialogue systems; semantic slot labelling; surface realisation; unsupervised and supervised learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language Technology Workshop (SLT), 2014 IEEE
  • Type

    conf

  • DOI
    10.1109/SLT.2014.7078559
  • Filename
    7078559