DocumentCode :
2696218
Title :
Large-scale document similarity computation based on cloud computing platform
Author :
He, Chaobo ; Tang, Yong ; Tang, Feiyi ; Yang, Atiao
Author_Institution :
Dept. of Comput., South China Normal Univ., Guangzhou, China
fYear :
2011
fDate :
26-28 Oct. 2011
Firstpage :
175
Lastpage :
179
Abstract :
Current approaches to large-scale document similarity computation suffer from low efficiency. To improve on them, this paper proposes a new approach based on a cloud computing platform. The approach computes document similarity using the traditional vector space model and applies the MapReduce computation model to parallelize both the distributed inverted index and the similarity computation. We first discuss the disadvantages of the traditional approaches, then present the structure of the distributed inverted index, the architecture of the cloud computing platform, and the core algorithms based on the MapReduce computation model. Finally, we report some related experiments. With this approach, large-scale document similarity computation can be run more efficiently and with better scalability.
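The abstract names the building blocks (vector space model, inverted index, MapReduce) but not the algorithms themselves. The following is only a minimal single-process sketch of how such a pipeline is commonly arranged, not the authors' implementation. The function names, the whitespace tokenizer, and the use of L2-normalised raw term frequencies as weights are all assumptions made for illustration; the paper may well use a different weighting such as TF-IDF.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def map_index(doc_id, text):
    """Map step 1 (hypothetical): emit (term, (doc_id, weight)) pairs.

    Weights are raw term frequencies divided by the document's L2 norm,
    so that summed products of weights give cosine similarity.
    """
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    for term, c in counts.items():
        yield term, (doc_id, c / norm)


def reduce_index(pairs):
    """Reduce step 1: group postings by term to form the inverted index."""
    index = defaultdict(list)
    for term, posting in pairs:
        index[term].append(posting)
    return index


def map_pairs(postings):
    """Map step 2: for every two documents sharing a term, emit a partial dot product."""
    for (d1, w1), (d2, w2) in combinations(sorted(postings), 2):
        yield (d1, d2), w1 * w2


def reduce_similarity(partials):
    """Reduce step 2: sum the partial products for each document pair."""
    sims = defaultdict(float)
    for pair, value in partials:
        sims[pair] += value
    return dict(sims)


def document_similarity(docs):
    """Run both map/reduce rounds in-process over a {doc_id: text} mapping."""
    index = reduce_index(
        kv for doc_id, text in docs.items() for kv in map_index(doc_id, text)
    )
    return reduce_similarity(
        kv for postings in index.values() for kv in map_pairs(postings)
    )
```

Calling `document_similarity({"a": "cloud index", "b": "cloud index", "c": "vector model"})` yields an entry only for the pair `("a", "b")`, because pairs that share no term never appear in any posting list. Skipping such pairs is the usual reason for routing the computation through an inverted index instead of comparing every document with every other one.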
Keywords :
cloud computing; document handling; MapReduce computation model; cloud computing platform; distributed inverted index; large scale document similarity computation; vector model space; Computational modeling; DSL; Indexes; Monitoring; document similarity; map-reduce; vector space model
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Pervasive Computing and Applications (ICPCA), 2011 6th International Conference on
Conference_Location :
Port Elizabeth
Print_ISBN :
978-1-4577-0209-9
Type :
conf
DOI :
10.1109/ICPCA.2011.6106499
Filename :
6106499