• DocumentCode
    390585
  • Title

    A predication-based approach for effective resource discovery in topical web

  • Author

    Ma, Liang ; Chen, Qunxiu ; Wang, Jun ; Xu, Guowei ; Cai, Lianhong

  • Author_Institution
    Dept. of Comput. Sci., Tsinghua Univ., Beijing, China
  • Volume
    1
  • fYear
    2002
  • fDate
    28-31 Oct. 2002
  • Firstpage
    61
  • Abstract
    Due to enormous growth of the World Wide Web in recent years, crawling specific topical portions quickly without having to explore all Web pages has become a new challenge for resource discovery. A new idea is to predicate the URL's relevance degree to the topic by related properties of the URL, then crawl the URLs with high probability. In this paper, we do further study on the topic resource and introduce some new properties helpful for more effective relevance predication. We also improve the evaluation algorithm and add two rules to adjust the weights of factors dynamically, which lead to better predication precision. These new issues improve the system performance due to higher topic harvest rate and lower sensitivity to various kinds of initial URL seeds.
  • Keywords
    Web sites; information retrieval; URL relevance degree predication; World Wide Web; effective resource discovery; topic harvest rate; topical Web; Aggregates; Computer science; Couplings; Crawlers; Data mining; Intelligent systems; Search engines; Uniform resource locators; Web pages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    TENCON '02. Proceedings. 2002 IEEE Region 10 Conference on Computers, Communications, Control and Power Engineering
  • Print_ISBN
    0-7803-7490-8
  • Type

    conf

  • DOI
    10.1109/TENCON.2002.1181214
  • Filename
    1181214