Title :
Web documents clustering with interest links
Author :
Cui, Zifeng ; Xu, Baowen ; Zhang, Weifeng ; Xu, Junling
Abstract :
Web document clustering is an effective Web mining technique. This paper proposes a novel Web document clustering algorithm from the perspective of Web usage, based on analysis of the WWW cache, in which the Web documents reflect the user's recent interests. To exploit the rich semantic information embedded in the hyperlinks of Web documents, we first extract the hyperlinks from the documents and model the Web documents in the WWW cache as an undirected Web graph. A clustering algorithm based on this Web graph model is then given. Finally, experimental results verify that the algorithm is efficient and feasible.
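The pipeline outlined in the abstract (extract hyperlinks, build an undirected Web graph over the cached documents, cluster on that graph) can be illustrated with a minimal Python sketch. The abstract does not specify the clustering step, so connected components are used here purely as a stand-in and are not the authors' algorithm. The names `LinkExtractor`, `build_web_graph` and `connected_components`, and the representation of the cache as a dict from URL to HTML text, are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the absolute href targets of <a> tags in one document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links and drop any "#fragment" part.
                url, _fragment = urldefrag(urljoin(self.base_url, value))
                self.links.add(url)


def build_web_graph(cache):
    """Build an undirected graph over cached documents.

    cache: dict mapping URL -> HTML text (a hypothetical cache snapshot).
    Returns a dict mapping each URL to the set of its neighbours.
    """
    graph = {url: set() for url in cache}
    for url, html in cache.items():
        parser = LinkExtractor(url)
        parser.feed(html)
        for target in parser.links:
            # Keep only links between two cached documents, no self-loops.
            if target in cache and target != url:
                graph[url].add(target)
                graph[target].add(url)
    return graph


def connected_components(graph):
    """Stand-in clustering: each connected component is one cluster."""
    seen = set()
    clusters = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(component)
    return clusters
```

Calling `connected_components(build_web_graph(cache))` on a small dict of cached pages groups together the documents that are connected by hyperlinks. Treating each link as undirected follows the abstract's graph model; a real implementation would replace the component step with the paper's own clustering procedure.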
Keywords :
Internet; data mining; document handling; WWW cache analysis; Web document hyperlink; Web documents clustering algorithm; Web graph model; Web mining; Web usage; Web user interest link; World Wide Web; semantic information; Algorithm design and analysis; Clustering algorithms; Computer science; Greedy algorithms; Laboratories; Partitioning algorithms; Search engines; Software engineering
Conference_Title :
Service-Oriented System Engineering, 2005. SOSE 2005. IEEE International Workshop
Print_ISBN :
0-7695-2438-9
DOI :
10.1109/SOSE.2005.39