• DocumentCode
    2547926
  • Title

    WebMiner--Anatomy of Super Peer Based Incremental Topic-Specific Web Crawler

  • Author

    Vikas, Om ; Chiluka, Nitin J. ; Ray, Purushottam K. ; Meena, Girraj ; Meshram, Akhil K. ; Gupta, Amit ; Sisodia, Abhishek

  • Author_Institution
    Indian Inst. of Inf. Technol. & Manage., Gwalior
  • fYear
    2007
  • fDate
    22-28 April 2007
  • Firstpage
    32
  • Lastpage
    32
  • Abstract
    This paper introduces "WebMiner", a super-peer based P2P system for building an incremental topic-specific Web crawler. The system builds a topic-based repository of Web pages that will later be used to construct ontologies. Current crawlers suffer from a centralized architecture, which creates a single point of failure and a heavy load. Super-peer systems strike a balance between the inherent efficiency of centralized search and the autonomy, load balancing, and robustness to attacks provided by distributed search, while taking advantage of the heterogeneity of capabilities across peers. In this paper, we discuss the architecture of WebMiner in detail, including the construction of the super-peer overlay network and the working of the system, which includes support for crawling the hidden Web.
  • Keywords
    Web sites; data mining; ontologies (artificial intelligence); semantic Web; P2P system; Web crawler; WebMiner; hidden Web; ontologies; super-peer; Anatomy; Crawlers; Information management; Information technology; Load management; Ontologies; Search engines; Service oriented architecture; Uniform resource locators; Web pages;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Networking, 2007. ICN '07. Sixth International Conference on
  • Conference_Location
    Martinique
  • Print_ISBN
    0-7695-2805-8
  • Electronic_ISBN
    0-7695-2805-8
  • Type

    conf

  • DOI
    10.1109/ICN.2007.104
  • Filename
    4196225