  • DocumentCode
    3155985
  • Title
    Search-Engine-Oriented Theme Crawler Design
  • Author
    Dong, Qin
  • Author_Institution
    Yancheng Inst. of Technol., Yancheng, China
  • Volume
    2
  • fYear
    2010
  • fDate
    12-14 Nov. 2010
  • Firstpage
    303
  • Lastpage
    306
  • Abstract
    A theme crawler is the most important part of a vertical search engine. To retrieve web pages efficiently and accurately, this paper studies the design of a theme crawler. Seed links and similarity measurement are the two key techniques for a theme crawler, and both are explained in detail. The relevant program code and algorithms are provided to explain these two techniques clearly. The crawling process begins with fetching seed links; the host search engine, the search engine interface, and link fetching are illustrated in the paper. To improve the crawler's efficiency, a page evaluation model was added to the crawler module.
  • Keywords
    search engines; page evaluation; program codes; theme crawler; vertical search engine; Arrays; Crawlers; Engines; Google; Transforms; Web pages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    System Science, Engineering Design and Manufacturing Informatization (ICSEM), 2010 International Conference on
  • Conference_Location
    Yichang
  • Print_ISBN
    978-1-4244-8664-9
  • Type
    conf
  • DOI
    10.1109/ICSEM.2010.169
  • Filename
    5640213