DocumentCode :
2544691
Title :
Web Page Recognition Algorithm Based on Link Analysis in Theme Search Engine
Author :
Zude Chen ; Jianxun Liu ; Haijun Zhai ; Lei Jiang ; Buqing Cao
Author_Institution :
Dept. of Comput. Sci. & Eng., Hunan Univ. of Sci. & Technol., Xiangtan, China
fYear :
2012
fDate :
1-3 Nov. 2012
Firstpage :
405
Lastpage :
409
Abstract :
Web page recognition is a problem in the design of the web crawler for a theme search engine. This paper presents a web page recognition algorithm based on link analysis to address this problem. The main idea is to build a recognition model for theme-relevant web pages by combining link analysis with a theme URL knowledge base, drawing on ideas from statistics and social network analysis. In the experiments, the algorithm achieves a precision of over 93 percent and a recall of up to 85.4 percent, which the authors report as better than other web page recognition algorithms. The experimental results show the feasibility and effectiveness of the algorithm.
Keywords :
Web design; pattern recognition; search engines; statistics; Web page recognition algorithm; link analysis; social network analysis; statistics; theme URL knowledge base; theme search engine; web crawler design; Algorithm design and analysis; Classification algorithms; Crawlers; Feature extraction; Knowledge based systems; Search engines; Web pages; Link analysis; Theme knowledge recognition; Theme search engine; Web page recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cloud and Green Computing (CGC), 2012 Second International Conference on
Conference_Location :
Xiangtan
Print_ISBN :
978-1-4673-3027-5
Type :
conf
DOI :
10.1109/CGC.2012.42
Filename :
6382848