DocumentCode :
3055813
Title :
Web Document Clustering Research Based on Granular Computing
Author :
Shangzhi, Zheng ; Xiaolong, Zhao ; Buqun, Zhang ; Hualong, Bu
Author_Institution :
Dept. of Comput. Sci. & Technol., Chaohu Univ., Chaohu, China
Volume :
2
fYear :
2009
fDate :
22-24 May 2009
Firstpage :
446
Lastpage :
450
Abstract :
This paper presents a method of Web document clustering based on granular computing (WDCGrc). The method computes the weight of each word in a document using the TF-IDF principle. A document threshold and an average weight value are then combined to reduce dimensionality and extract the keywords of each document. The paper establishes a mapping between document keywords and binary granules, and applies a granular-computing-based association rule algorithm to obtain frequent item sets across documents. Drawing on set theory, the number of words shared by two documents is taken as their similarity, from which the clustering result is obtained. Experiments show that the method is practical and feasible and yields good clustering quality.
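The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the exact thresholding rule, the granule encoding, or the grouping step, so the average-weight cutoff, the per-keyword bitmask, the `min_shared` parameter, and all function names below are assumptions made for illustration only.

```python
import math
from collections import Counter
from itertools import combinations

def tfidf(docs):
    """docs: list of token lists. Returns one {word: weight} dict per document."""
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))  # document frequency
    return [
        {w: (c / len(d)) * math.log(n / df[w]) for w, c in Counter(d).items()}
        for d in docs
    ]

def keywords(weights):
    """Assumed rule: keep words weighted at or above the document's average."""
    result = []
    for w in weights:
        avg = sum(w.values()) / len(w) if w else 0.0
        result.append({t for t, v in w.items() if v >= avg})
    return result

def binary_granules(kw):
    """Assumed encoding: one bitmask per keyword, bit i set if document i has it."""
    granules = {}
    for i, terms in enumerate(kw):
        for t in terms:
            granules[t] = granules.get(t, 0) | (1 << i)
    return granules

def shared_keyword_counts(granules, n_docs):
    """Count, per document pair, the keywords whose granule covers both documents."""
    sim = {}
    for i, j in combinations(range(n_docs), 2):
        mask = (1 << i) | (1 << j)
        sim[(i, j)] = sum(1 for g in granules.values() if g & mask == mask)
    return sim

def cluster(sim, n_docs, min_shared=2):
    """Assumed grouping: link documents that share at least min_shared keywords."""
    parent = list(range(n_docs))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), s in sim.items():
        if s >= min_shared:
            parent[find(i)] = find(j)
    groups = {}
    for d in range(n_docs):
        groups.setdefault(find(d), []).append(d)
    return sorted(groups.values())

if __name__ == "__main__":
    docs = [
        "web cluster granule granule".split(),
        "web cluster granule rule".split(),
        "stock market price price".split(),
    ]
    kw = keywords(tfidf(docs))
    sim = shared_keyword_counts(binary_granules(kw), len(docs))
    print(cluster(sim, len(docs), min_shared=1))
```

Testing a keyword's granule against a bitmask with bitwise AND is the pairwise case of the frequent item set idea: the number of set bits in an ANDed granule is the support of that keyword combination across documents.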
Keywords :
Internet; data mining; document handling; pattern clustering; set theory; TF-IDF principle; WDCGrc; Web document clustering; association rule; average weight value; binary granule; dimension reduction; document keyword; document threshold; document word; granular computing; Association rules; Chaos; Clustering algorithms; Computer science; Computer security; Data mining; Electronic commerce; Information processing; Web pages; Clustering; Granular computing; Web documents
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Electronic Commerce and Security, 2009. ISECS '09. Second International Symposium on
Conference_Location :
Nanchang
Print_ISBN :
978-0-7695-3643-9
Type :
conf
DOI :
10.1109/ISECS.2009.16
Filename :
5209712