DocumentCode :
1863401
Title :
Empirical analysis of grouping web pages using vector space model for link structures
Author :
Sasaki, Yuichi ; Kurihara, Masahito
Author_Institution :
Res. Group of Complex Syst. Eng., Hokkaido Univ., Sapporo
fYear :
2008
fDate :
25-27 June 2008
Firstpage :
188
Lastpage :
193
Abstract :
Several kinds of vector space models have been developed for analyzing document similarity in order to group Web pages. However, they are not used for analyzing link structures, partly because link structures are complex and links do not necessarily satisfy the similarity relation. If we can devise vector space models for link structures, we can combine them with the models for document similarity to develop a unified basis for grouping Web pages. In this paper, we present a vector space model for link structures based on the notion of link vectors, which are characteristic vectors designed specifically for link structures. We also discuss the extension of this model to a content-link vector space model, which can treat the document information and link information of Web pages in a unified way. Preliminary experiments show that the models perform well even when document information is ignored.
Keywords :
Internet; document handling; vectors; Web page grouping; content-link vector space model; document information; document similarity; link structures; Bipartite graph; Computer applications; Frequency; Functional analysis; Information analysis; Information science; Internet; Space technology; Systems engineering and theory; Web pages;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Soft Computing in Industrial Applications, 2008. SMCia '08. IEEE Conference on
Conference_Location :
Muroran
Print_ISBN :
978-1-4244-3782-5
Electronic_ISBN :
978-4-9904-2590-6
Type :
conf
DOI :
10.1109/SMCIA.2008.5045958
Filename :
5045958