DocumentCode
3155985
Title
Search-Engine-Oriented Theme Crawler Design
Author
Dong, Qin
Author_Institution
Yancheng Inst. of Technol., Yancheng, China
Volume
2
fYear
2010
fDate
12-14 Nov. 2010
Firstpage
303
Lastpage
306
Abstract
A theme crawler is the most important part of a vertical search engine. This paper studies the design of a theme crawler that recalls web pages efficiently and accurately. Seed links and similarity measurement, the two key techniques for a theme crawler, are explained in detail, and the relevant program code and algorithms are provided to illustrate these two techniques clearly. The crawling process begins with fetching seed links; the host search engine, the search engine interface, and link fetching are illustrated in the paper. To improve the crawler's efficiency, a page evaluation model was added to the crawler module.
Keywords
search engines; page evaluation; program codes; theme crawler; vertical search engine; Arrays; Crawlers; Engines; Google; Transforms; Web pages
fLanguage
English
Publisher
IEEE
Conference_Titel
System Science, Engineering Design and Manufacturing Informatization (ICSEM), 2010 International Conference on
Conference_Location
Yichang
Print_ISBN
978-1-4244-8664-9
Type
conf
DOI
10.1109/ICSEM.2010.169
Filename
5640213