DocumentCode :
694736
Title :
New Features Acquisition of Text with Cloud-LDA Model
Author :
Maoyuan Zhang ; Fanli He ; Shuiyin Chen
Author_Institution :
Dept. of Comput. Sci. & Technol., Central China Normal Univ., Wuhan, China
fYear :
2013
fDate :
7-8 Dec. 2013
Firstpage :
267
Lastpage :
272
Abstract :
This paper investigates how to improve information retrieval (IR) by changing the feature distribution of the text. It introduces Cloud Model theory into the Latent Dirichlet Allocation (LDA) model to build a new feature selection system. The LDA model is used to mine the underlying topical structure: each topic is associated with a multinomial distribution over semantically related words. However, it is doubtful whether the topics themselves are semantically related to one another. Based on the probability distribution of the vocabulary in the text given by the LDA model, the new system uses Cloud Model theory to automatically simulate the feature set whose contribution degree in the text is high. Results show that this feature set contains fewer features yet yields higher classification accuracy, and so outperforms currently popular feature selection methods. When a query is matched against words with a high contribution degree, the more such words are matched, the more relevant the retrieved article is to the query. Experiments on the Single Language IR (SLIR) task with the NTCIR-5 (the 5th NII Test Collection for IR Systems) collections show that this method achieves a clear improvement over some other IR methods.
Keywords :
cloud computing; feature extraction; information retrieval; statistical distributions; text analysis; Information Retrieval; LDA model; NTCIR-5; SLIR; cloud model theory; cloud-LDA model; feature selection system; latent Dirichlet allocation; probability distribution; single language IR; text features acquisition; Computational modeling; Data models; Feature extraction; Indexes; Information retrieval; Semantics; Uncertainty; Cloud Model; Information Retrieval; LDA model; feature;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Science and Cloud Computing Companion (ISCC-C), 2013 International Conference on
Conference_Location :
Guangzhou
Type :
conf
DOI :
10.1109/ISCC-C.2013.94
Filename :
6973603