DocumentCode :
3778691
Title :
Gaussian process based text categorization for healthy information
Author :
Sih-Huei Chen;Yuan-Shan Lee;Tzu-Chiang Tai;Jia-Ching Wang
Author_Institution :
Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan, R.O.C.
fYear :
2015
Firstpage :
30
Lastpage :
33
Abstract :
With the development of medical technology, more and more people are paying attention to their health. A large amount of health information can now be easily obtained from websites, so text categorization is important for analyzing this information. In this work, we propose a text categorization system based on a Gaussian process. The proposed system has two parts: feature learning and classification. In the first part, we apply latent Dirichlet allocation (LDA) to obtain the proportions of K latent topics for each document; the resulting K-dimensional vector serves as the feature of that document. In the classification part, a Gaussian process (GP) is used for text categorization. Text documents from 10 classes are categorized using the one-versus-one approach. The experimental results show that the proposed system performs well in text categorization, especially when the training dataset is small.
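The two-stage pipeline described in the abstract (LDA topic proportions as features, followed by a one-versus-one Gaussian process classifier) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the toy corpus, the labels, the value K = 3, and the RBF kernel are all assumptions made for the example, and the paper's own data, kernel, and settings are not given in this record.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

# Hypothetical toy corpus with three classes (the paper uses 10 classes).
docs = [
    "balanced diet with vegetables fruit and whole grains",
    "vitamins and protein intake for a healthy diet",
    "daily running and stretching exercise routine",
    "strength training exercise improves muscle and fitness",
    "sleep quality and insomnia affect rest at night",
    "regular sleep schedule helps deep rest and recovery",
]
labels = np.array(["diet", "diet", "exercise", "exercise", "sleep", "sleep"])

K = 3  # number of latent topics (assumed value for illustration)

# Feature learning: word counts -> K-dimensional topic proportions per document.
counts = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=K, random_state=0)
topic_features = lda.fit_transform(counts)

# Classification: one binary GP classifier per pair of classes (one-versus-one).
# The RBF kernel is an assumption; the record does not state which kernel was used.
gp = GaussianProcessClassifier(
    kernel=RBF(length_scale=1.0),
    multi_class="one_vs_one",
    random_state=0,
)
gp.fit(topic_features, labels)
predictions = gp.predict(topic_features)
```

Each row of `topic_features` is a distribution over the K topics and therefore sums to 1. Note that scikit-learn's `GaussianProcessClassifier` supports `predict` but not `predict_proba` in one-versus-one mode.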
Keywords :
"Text categorization","Testing","Gaussian processes","Classification algorithms","Feature extraction","Training","Training data"
Publisher :
IEEE
Conference_Title :
Orange Technologies (ICOT), 2015 International Conference on
Type :
conf
DOI :
10.1109/ICOT.2015.7498487
Filename :
7498487