DocumentCode :
2828769
Title :
Data extraction and annotation for dynamic Web pages
Author :
Song, Hui ; Giri, Suraj ; Ma, Fanyuan
Author_Institution :
Dept. of Comput. Sci. & Eng., Shanghai Jiao Tong Univ., China
fYear :
2004
fDate :
28-31 March 2004
Firstpage :
499
Lastpage :
502
Abstract :
Many Web sites contain large sets of pages generated dynamically from a common template. The structured data extracted from these pages, together with semantic annotation, are valuable for information systems. We propose a system, ADeaD, that automatically extracts data values from these Web pages and annotates the data schema. Experimental evaluation on many real Web page collections indicates that our algorithm correctly extracts the data and annotates the data schema.
Keywords :
Web sites; information retrieval; semantic Web; data extraction; dynamic Web pages; semantic annotation; structured data; template; wrapper generation; Computer science; Data mining; Databases; Graphical user interfaces; HTML; Humans; Information systems; Internet; Web pages; Writing
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
e-Technology, e-Commerce and e-Service, 2004. EEE '04. 2004 IEEE International Conference on
Print_ISBN :
0-7695-2073-1
Type :
conf
DOI :
10.1109/EEE.2004.1287353
Filename :
1287353