DocumentCode :
2788496
Title :
A new topic-bridged model for transfer learning
Author :
Wu, Meng-Sung ; Chien, Jen-Tzung
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan
fYear :
2010
fDate :
14-19 March 2010
Firstpage :
5346
Lastpage :
5349
Abstract :
In real-world information systems, unlabeled data are abundant but labeled data are sparse. It is challenging to construct an adaptive model that classifies a large number of documents spanning different domains. Classifiers trained on a source domain tend to perform poorly on test data from a target domain because of the domain mismatch. In this study, we build a topic-bridged latent Dirichlet allocation (TLDA) model from a variety of labeled and unlabeled documents and perform transfer learning for document classification. The severe change in word distributions is compensated for by bridging the latent topics of the source and target data, which are drawn from Dirichlet priors. A variational inference procedure is performed for semi-supervised learning. In text categorization experiments on the 20 Newsgroups dataset, the proposed TLDA model achieved higher classification performance than the other methods.
Keywords :
inference mechanisms; learning (artificial intelligence); pattern classification; text analysis; variational techniques; word processing; document classification; document classifier; newsgroup dataset; semisupervised learning; sparse labeled data; text categorization; topic bridged latent Dirichlet allocation; transfer learning; variational inference; Bayesian methods; Computer science; Data engineering; Knowledge transfer; Linear discriminant analysis; Predictive models; Semisupervised learning; Signal processing algorithms; Testing; Text categorization; Bayes procedures; text processing; text recognition
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
ISSN :
1520-6149
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2010.5494947
Filename :
5494947