DocumentCode :
2730196
Title :
Annotating Structured Data of the Deep Web
Author :
Yiyao Lu ; Hai He ; Hongkun Zhao ; Weiyi Meng ; Clement Yu
Author_Institution :
State Univ. of New York, Binghamton, NY, USA
fYear :
2007
fDate :
15-20 April 2007
Firstpage :
376
Lastpage :
385
Abstract :
An increasing number of databases have become Web-accessible through HTML form-based search interfaces. The data units returned from the underlying database are usually encoded into the result pages dynamically for human browsing. For the encoded data units to be machine-processable, which is essential for many applications such as deep Web data collection and comparison shopping, they need to be extracted and assigned meaningful labels. In this paper, we present a multi-annotator approach that first aligns the data units into groups such that the data in the same group have the same semantics. Then, for each group, we annotate it from different aspects and aggregate the different annotations to predict a final annotation label. An annotation wrapper for the search site is automatically constructed and can be used to annotate new result pages from the same site. Our experiments indicate that the proposed approach is highly effective.
Keywords :
Internet; query processing; HTML form-based search interfaces; annotation wrapper; deep Web data collection; multiannotator approach; structured data; Aggregates; Books; Data mining; HTML; Helium; Humans; Information retrieval; Relational databases; Search engines; Spatial databases;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on
Conference_Location :
Istanbul
Print_ISBN :
1-4244-0802-4
Type :
conf
DOI :
10.1109/ICDE.2007.367883
Filename :
4221686