Title :
Extracting and Clustering Method of Web Bipartite Cores
Author :
Yang, Nan ; Ding, Hui ; Liu, Yue
Author_Institution :
Inf. Sch., Renmin Univ. of China, Beijing, China
Abstract :
This paper addresses several key problems in Web community discovery. In the setting of topic-oriented community discovery, we analyze the shortcomings of using a CBG (complete bipartite graph) as the community signature in the trawling method. We introduce the concept of an x-core-set as a replacement for the CBG and argue that it is a more reasonable signature of a community core. To obtain an x-core-set, we construct a bipartite graph from a node x and then apply (i, j)-pruning to that graph. Scanning the topic subgraph in this way yields a set of x-core-sets. Finally, a hierarchical clustering algorithm is applied to these x-core-sets to form a dendrogram of communities. We prove that an x-core-set, which consists of x-cores, can be computed from the bipartite graph collected from x followed by (i, j)-pruning. The experiments use the same dataset as the HITS method, except that the returned pages are merged from four search engines. The results indicate that the algorithm is effective and efficient.
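The two graph steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define how the bipartite graph is collected from x or what (i, j)-pruning means exactly, so the sketch assumes the usual trawling convention (a fan must link to at least j centers, a center must be linked from at least i fans) and assumes that the centers are the pages x links to. The names `build_bipartite_from`, `ij_prune`, and `out_links` are hypothetical.

```python
from collections import defaultdict


def build_bipartite_from(x, out_links):
    """Collect fan -> centers edges around seed page x.

    Assumption (not stated in the abstract): the centers are the pages
    x links to, and the fans are all pages in out_links that link to at
    least one of those centers (x itself included).
    """
    centers = set(out_links.get(x, ()))
    edges = {}
    for page, targets in out_links.items():
        shared = centers & set(targets)
        if shared:
            edges[page] = shared
    return edges


def ij_prune(edges, i, j):
    """Trawling-style (i, j)-pruning, under the assumed convention.

    Repeatedly drop centers linked from fewer than i fans and fans
    linking to fewer than j centers, until nothing changes.
    """
    fans = {f: set(cs) for f, cs in edges.items()}
    changed = True
    while changed:
        changed = False
        # Count how many surviving fans point at each center.
        in_deg = defaultdict(int)
        for cs in fans.values():
            for c in cs:
                in_deg[c] += 1
        weak_centers = {c for c, d in in_deg.items() if d < i}
        for f in list(fans):
            if fans[f] & weak_centers:
                fans[f] -= weak_centers
                changed = True
            if len(fans[f]) < j:
                del fans[f]
                changed = True
    return fans
```

Given a dictionary mapping each page to the pages it links to, `ij_prune(build_bipartite_from("x", out_links), i, j)` returns the surviving fans with their surviving centers. Under the assumptions above, this remainder plays the role of the pruned graph from which the x-cores would be read off. The hierarchical clustering step is left out because the abstract does not specify a similarity measure between x-core-sets.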
Keywords :
Internet; graph theory; information retrieval; pattern clustering; search engines; HITS method; Web bipartite cores; Web communities discovery; clustering algorithm; clustering method; complete bipartite graph; extracting method; search engines; topic-oriented communities discovery; trawling method; x-core-set; Algorithm design and analysis; Bipartite graph; Clustering algorithms; Communities; Earth Observing System; Fans; Search engines; bipartite cores; hierarchical clustering; web communities;
Conference_Title :
Web Information Systems and Applications Conference (WISA), 2010 7th
Conference_Location :
Hohhot
Print_ISBN :
978-1-4244-8440-9
DOI :
10.1109/WISA.2010.40