• DocumentCode
    468246
  • Title

    MixPR-An Approach of Combining Content and Links of Web Page

  • Author

    Guo, Ye

  • Author_Institution
    Xi'an Univ. of Finance & Econ., Xi'an
  • Volume
    2
  • fYear
    2007
  • fDate
    24-27 Aug. 2007
  • Firstpage
    456
  • Lastpage
    460
  • Abstract
    PageRank is used in systems based on hyperlink structure, such as Google, while TFIDF is widely used in IR systems based on the vector space model (VSM). Combining the advantages of the two approaches is worthwhile. In this paper, we build a new model that uses both the content of Web pages and the links among pages. We construct a transition probability matrix composed of link information and the relevance value of each page to the given query, where the relevance value is given by TFIDF. We obtain MixPR (mixed PageRank) by solving the equation with this coefficient matrix. In this model, part of the pages, namely those used to compute TFIDF, are first downloaded from the Internet, and the link information originating from those pages is also stored on a local server. The importance of a page is thus determined by both its content and its links. Experimental results show that the new model works well, with precision approaching that of TFIDF.
  • Keywords
    Internet; information retrieval; search engines; Google; Pagerank; Web page; hyperlink structure; transition probability matrix; vector space model; Content based retrieval; Databases; Delay; Equations; Finance; Internet; Search engines; Web pages; Web search; Web server;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery, 2007. FSKD 2007. Fourth International Conference on
  • Conference_Location
    Haikou
  • Print_ISBN
    978-0-7695-2874-8
  • Type

    conf

  • DOI
    10.1109/FSKD.2007.407
  • Filename
    4406120
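
The abstract describes a transition probability matrix built from link information and TFIDF relevance, but it does not give the exact formula. The following is only a minimal sketch of one plausible reading, not the paper's method: it assumes that a page passes its rank to its out-links in proportion to their query relevance, and that the random jump is also biased by relevance. The function names (`tfidf_scores`, `mixed_pagerank`), the damping factor of 0.85, the smoothing constant `eps`, and the power-iteration solver are all assumptions introduced for illustration.

```python
import math
from collections import Counter


def tfidf_scores(docs, query):
    """Return one query-relevance score per document (sum of TF-IDF weights).

    Hypothetical helper: uses whitespace tokenization and a smoothed IDF,
    log(1 + N / df), which is one common variant among many.
    """
    n = len(docs)
    tokenized = [d.lower().split() for d in docs]
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))  # document frequency: count each term once per doc
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        length = max(len(toks), 1)  # avoid division by zero for empty documents
        s = 0.0
        for term in query.lower().split():
            if df[term]:
                s += (tf[term] / length) * math.log(1 + n / df[term])
        scores.append(s)
    return scores


def mixed_pagerank(links, relevance, damping=0.85, iters=100, eps=1e-6):
    """Power iteration over a relevance-weighted transition matrix.

    links[i] is the list of page indices that page i links to.
    Assumption: P(i -> j) is proportional to the relevance of j among the
    out-links of i; the teleport distribution is proportional to relevance.
    """
    n = len(links)
    w = [r + eps for r in relevance]  # smoothing keeps every page reachable
    total = sum(w)
    teleport = [x / total for x in w]
    rank = [1.0 / n] * n
    for _ in range(iters):
        new = [(1 - damping) * t for t in teleport]
        for i, outs in enumerate(links):
            if outs:
                z = sum(w[j] for j in outs)
                for j in outs:
                    new[j] += damping * rank[i] * w[j] / z
            else:
                # Dangling page: spread its rank by the teleport distribution.
                for j in range(n):
                    new[j] += damping * rank[i] * teleport[j]
        rank = new
    return rank


if __name__ == "__main__":
    docs = ["pagerank link analysis",
            "tfidf vector space model",
            "cooking recipes pasta"]
    links = [[1, 2], [0], [0]]
    print(mixed_pagerank(links, tfidf_scores(docs, "tfidf")))
```

In this sketch the scores always sum to 1, and a page that is both linked to and relevant to the query ends up ranked above an equally linked but irrelevant page. A direct linear solve could replace the iteration for small page sets.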