Title :
A factorization network based method for multi-lingual domain classification
Author :
Yangyang Shi ; Yi-Cheng Pan ; Mei-Yuh Hwang ; Kaisheng Yao ; Hu Chen ; Yuanhang Zou ; Baolin Peng
Author_Institution :
Microsoft, Redmond, WA, USA
Abstract :
In many spoken language understanding systems (SLUS), domain classification is the most crucial component, because system responses based on the wrong domain often yield a very unpleasant user experience. In multi-lingual domain classification, the training data for some low-resource languages often comes from machine translation, which distorts some of the higher-order n-gram features. Feature co-occurrence therefore becomes a more reliable feature in multi-lingual domain classification. In this paper, in order to model feature co-occurrences effectively, we propose Factorization Networks (FNs), which combine Factorization Machines (FMs) with Neural Networks (NNs). FNs extend the linear connections from the input feature layer to the hidden layer in NNs to factorization connections, which represent the weights of feature co-occurrences using a factorized method. In addition to FNs, we also propose a hybrid model that integrates FNs, NNs and Maximum Entropy (ME) models. The component models in the hybrid model share the same input features. On two data sets (the ATIS data set and Microsoft Cortana Chinese data), the proposed models show promising results. In particular, on the large Microsoft Cortana Chinese data set, which is translated from well-annotated English data, FNs using unigram, class and query length features achieve more than 20% relative error reduction over linear Support Vector Machines (SVMs).
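The abstract does not give the exact form of the factorization connections, so the following is only a minimal sketch of the general idea, assuming the standard second-order Factorization Machine term feeds a single hidden unit. The function names, the `tanh` activation, and the toy feature values are all illustrative assumptions, not the authors' implementation.

```python
import math

def fm_pairwise(x, V):
    """Second-order FM term: sum over i<j of <V[i], V[j]> * x[i] * x[j].

    Uses the usual O(n*k) identity instead of enumerating all pairs.
    V[i] is the (assumed) k-dimensional latent vector of input feature i.
    """
    k = len(V[0])
    total = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        total += 0.5 * (s * s - s_sq)
    return total

def brute_force_pairwise(x, V):
    """Same quantity computed pair by pair, for checking the identity."""
    n = len(x)
    return sum(
        sum(a * b for a, b in zip(V[i], V[j])) * x[i] * x[j]
        for i in range(n)
        for j in range(i + 1, n)
    )

def factorization_hidden_unit(x, w, V, b=0.0):
    """Hypothetical hidden unit: linear term plus factorized
    co-occurrence term, passed through tanh (activation is an assumption)."""
    linear = sum(wi * xi for wi, xi in zip(w, x))
    return math.tanh(b + linear + fm_pairwise(x, V))

# Toy input: four binary features (e.g. unigram indicators), two latent factors.
x = [1.0, 0.0, 1.0, 1.0]
w = [0.2, -0.1, 0.05, 0.3]
V = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4], [0.5, 0.0]]

h = factorization_hidden_unit(x, w, V)
```

Here the weight of each feature pair is never stored directly; it is the inner product of two latent vectors, which is what lets co-occurrence weights be estimated even for pairs that are rare in the training data.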
Keywords :
language translation; neural nets; pattern classification; ATIS data set; Microsoft Cortana Chinese data; factorization machines; factorization network; feature cooccurrence; higher order n-gram features; linear connections; machine translation; maximum entropy; multilingual domain classification; neural networks; spoken language understanding system; Artificial neural networks; Error analysis; Polynomials; Support vector machines; Training; Training data; Domain Classification; Factorization Networks; Spoken Language Understanding;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD
DOI :
10.1109/ICASSP.2015.7178978