DocumentCode
468246
Title
MixPR-An Approach of Combining Content and Links of Web Page
Author
Guo, Ye
Author_Institution
Xi'an Univ. of Finance & Econ., Xi'an
Volume
2
fYear
2007
fDate
24-27 Aug. 2007
Firstpage
456
Lastpage
460
Abstract
PageRank is used in systems based on hyperlink structure, such as Google. TFIDF is widely used in IR systems based on the vector space model (VSM). Combining the advantages of the two approaches is therefore worthwhile. In this paper, we build a new model that uses both the content of Web pages and the links among them. We construct a transition probability matrix composed of link information and the relevance of each page to the given query, where relevance is measured by TFIDF. We obtain MixPR (mixed PageRank) by solving the equation defined by this coefficient matrix. In this model, the subset of pages used to compute TFIDF is first downloaded from the Internet, and the link information originating from those pages is also stored on a local server. The importance of a page is thus determined by both its content and its links. Experimental results show that the new model works well, with precision approaching that of TFIDF.
Keywords
Internet; information retrieval; search engines; Google; Pagerank; Web page; hyperlink structure; transition probability matrix; vector space model; Content based retrieval; Databases; Delay; Equations; Finance; Web pages; Web search; Web server
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems and Knowledge Discovery, 2007. FSKD 2007. Fourth International Conference on
Conference_Location
Haikou
Print_ISBN
978-0-7695-2874-8
Type
conf
DOI
10.1109/FSKD.2007.407
Filename
4406120