DocumentCode
2916945
Title
Schema Extraction of Deep Web Query Interface
Author
Wang, Ying ; Peng, Tao ; Zuo, Wanli ; Zhu, Huifeng
Author_Institution
Coll. of Comput. Sci. & Technol., Jilin Univ., Changchun, China
fYear
2009
fDate
7-8 Nov. 2009
Firstpage
391
Lastpage
395
Abstract
For integrating web databases, the very first challenge is to understand what a query interface says or what query capabilities a source supports. From a user's point of view, the internal structure of a web page is of no concern; in most cases, semantic blocks are identified through visual elements. Therefore, this paper presents a novel schema extraction algorithm, based on the visual features of pages, that captures and analyzes the attributes and query controls of pages. First, the query interface region is identified by heuristic rules; then, the interface region is parsed by a page analysis algorithm; finally, the query interface region is processed using the visual features of the page to obtain logical attributes, which are represented as a linked list. Experimental results show that this method dramatically improves the extraction precision of query schemas.
Keywords
Internet; query processing; Web database; Web pages; deep Web query interface; heuristic rules; query capabilities; query interface region; schema extraction; semantic block; visual features; Arithmetic; Books; Computer science; Data mining; Educational institutions; Information systems; Ontologies; Relational databases; Visual databases; Web pages; deep web; schema extraction; visual features;
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Information Systems and Mining, 2009. WISM 2009. International Conference on
Conference_Location
Shanghai
Print_ISBN
978-0-7695-3817-4
Type
conf
DOI
10.1109/WISM.2009.86
Filename
5369411