DocumentCode :
3426929
Title :
Research and design of the crawler system in a vertical search engine
Author :
Li, Min ; Zhao, Jun ; Huang, Tinglei
Author_Institution :
Sch. of Comput. Sci., Yangtze Univ., Jingzhou, China
fYear :
2010
fDate :
22-24 Oct. 2010
Firstpage :
790
Lastpage :
792
Abstract :
The crawler system in a vertical search engine should first format a representative sample web page so that it conforms to W3C standards. The normalized page can then be parsed by the visual XPath generator, which yields the desired XPath value. In batch data extraction, the crawler system parses the target web pages and extracts the exact data required. A vertical search engine first extracts the necessary data and segments Chinese words, and then presents the data on web pages. The data structuring process that follows data extraction distinguishes a vertical search engine from a traditional search engine. The crawler system, which extracts domain-specific information from the Internet and performs preliminary processing on it, is an indispensable part of a vertical search engine.
Keywords :
Internet; search engines; Chinese words; W3C standards; Web page; batch-data-extraction; crawler system; data extraction; data structuring process; vertical search engine; visual XPath generator; Artificial intelligence; Educational institutions; Engines; Standards; word segmentation
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Computing and Integrated Systems (ICISS), 2010 International Conference on
Conference_Location :
Guilin
Print_ISBN :
978-1-4244-6834-8
Type :
conf
DOI :
10.1109/ICISS.2010.5657110
Filename :
5657110