Title :
Active learning of EHVS parser for Persian language understanding
Author :
Tajgardoon, M.A. ; Jabbari, Fattaneh ; Sameti, Hossein ; Bahaadini, S.
Author_Institution :
Inf. Technol. Dept., Modiran Vehicle Manuf. (MVM), Tehran, Iran
Abstract :
One of the main elements of a spoken dialogue system is the Spoken Language Understanding (SLU) unit. Hidden Vector State (HVS) is one of the popular statistical methods applied to the SLU component, and Extended Hidden Vector State (EHVS) is an enhanced version of HVS. Although both parsers need only abstract data annotation, labeling the data is still quite time-consuming and difficult. We therefore present a novel active learning method for the EHVS parser that reduces the human labeling effort. The active learner uses pattern classification to select informative data based on four different uncertainty measures. Experiments are conducted on a Persian dataset, the University Information Kiosk corpus. The experimental results show that the active EHVS improves performance, with a 15.46% improvement in the case of the entropy-probability uncertainty measure. This demonstrates the effectiveness and feasibility of the proposed approach.
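The uncertainty-based selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not define the four uncertainty measures, so this example uses plain Shannon entropy over class posteriors as a stand-in; it is not the authors' entropy-probability measure. The names `entropy`, `select_most_uncertain`, and the toy `pool` of posteriors are hypothetical and do not come from the paper.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(pool, k):
    """Return the ids of the k pool items whose posteriors have highest entropy.

    pool maps an utterance id to a list of posterior probabilities.
    These posteriors are assumed to come from some classifier; they are
    illustrative values, not output of the EHVS parser.
    """
    ranked = sorted(pool, key=lambda uid: entropy(pool[uid]), reverse=True)
    return ranked[:k]


# Toy unlabeled pool: a confident item, a very uncertain one, and one in between.
pool = {
    "utt1": [0.98, 0.01, 0.01],
    "utt2": [0.40, 0.35, 0.25],
    "utt3": [0.70, 0.20, 0.10],
}

# The selected utterances would be sent to a human annotator first.
selected = select_most_uncertain(pool, 2)
print(selected)
```

In an active learning loop, the selected items would be labeled, added to the training set, and the model retrained before the next selection round.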
Keywords :
entropy; grammars; interactive systems; learning (artificial intelligence); natural language processing; pattern classification; statistical analysis; EHVS parser; Persian dataset; Persian language understanding; SLU unit; University Information Kiosk corpus; abstract data annotation; active EHVS; active learning method; entropy-probability uncertainty measure; extended hidden vector state; informative data; pattern classification; spoken dialogue system; spoken language understanding; statistical methods; uncertainty measures; Entropy; Measurement uncertainty; Semantics; Support vector machines; Training; Uncertainty; Vectors; Active EHVS; EHVS; Spoken language understanding; Uncertainty measure;
Conference_Title :
Telecommunications (IST), 2012 Sixth International Symposium on
Conference_Location :
Tehran
Print_ISBN :
978-1-4673-2072-6
DOI :
10.1109/ISTEL.2012.6483100