DocumentCode :
2949729
Title :
Designing and Implementing of the Webpage Information Extracting Model Based on Tags
Author :
Xu, Zhang ; Yan, Dong
Author_Institution :
Dept. of Inf., Peking Union Univ., Beijing, China
fYear :
2011
fDate :
20-21 Aug. 2011
Firstpage :
273
Lastpage :
275
Abstract :
This article presents a novel tag-based model for Webpage information extraction. With its ingenious algorithm, the model performed better than Html Parser and Jsoup in most cases. It can serve as a URL filter for a Net Crawler to improve the crawler's efficiency.
Keywords :
Web sites; information retrieval; search engines; URL filter; Web page information extracting model; net crawler; tags; Context; Data mining; HTML; Law; Search engines; Web pages; Html Parser; Html Tag; Jsoup; Webpage information extraction;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligence Science and Information Engineering (ISIE), 2011 International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4577-0960-9
Electronic_ISBN :
978-0-7695-4480-9
Type :
conf
DOI :
10.1109/ISIE.2011.71
Filename :
5997433