DocumentCode :
3073134
Title :
Similarity Computation of Web Pages of Focused Crawler
Author :
Yu, Huo Ling ; Bingwu, Liu ; Fang, Yan
Author_Institution :
Sch. of Inf., Beijing Wuzi Univ., Beijing, China
Volume :
2
fYear :
2010
fDate :
16-18 July 2010
Firstpage :
70
Lastpage :
72
Abstract :
Due to the dynamic nature of the Web, it is becoming harder to find relevant and recent information. More and more people now use focused crawlers to gather information in their specialized fields. However, similarity computation based on text alone is inadequate, because a page consists not only of text but also of multimedia content such as images, audio, and video. For a focused crawler, page structure also plays a key role in similarity computation. In this paper we introduce a new method that computes similarity according to both page structure and content. The method makes Web page similarity computation more accurate and crawling more efficient, which benefits Web analysis and makes it easier for users to obtain information.
Keywords :
Internet; Web analysis; Web page similarity computation; page content; page structure; similarity computation; Computational modeling; Crawlers; HTML; Head; Indexing; Shape; Web pages; Content Similarity; Focused Crawler; Page Structure; Similarity computation
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Technology and Applications (IFITA), 2010 International Forum on
Conference_Location :
Kunming
Print_ISBN :
978-1-4244-7621-3
Electronic_ISBN :
978-1-4244-7622-0
Type :
conf
DOI :
10.1109/IFITA.2010.308
Filename :
5634920