DocumentCode
477785
Title
Effective Schema Extraction of Query Interfaces on the Deep Web
Author
Qiang, Bao-hua ; Xi, Jian-qing ; Chen, Ling
Author_Institution
Sch. of Comput. Sci. & Eng., South China Univ. of Technol., Guangzhou
Volume
2
fYear
2008
fDate
18-20 Oct. 2008
Firstpage
291
Lastpage
295
Abstract
The Deep Web is becoming a very important information resource. Unlike content reached through traditional Web information retrieval, the contents of the Deep Web are accessible only through source query interfaces. However, for any domain of interest, users may need to access many query interfaces to get the desired information, which is time-consuming and calls for building an integrated query interface over the sources. The first important task toward this goal is schema extraction of source query interfaces. In this paper, we present a novel pre-clustering algorithm with proper grouping patterns to obtain a partial clustering of attributes. Our approach avoids obtaining incorrect subsets when grouping attributes. The experimental results show that our approach is highly effective for schema extraction of source query interfaces on the Deep Web.
Keywords
Internet; query formulation; Deep Web; information resource; information retrieval; pre-clustering algorithm; query interfaces; schema extraction; Cities and towns; Computer science; Content based retrieval; Data mining; Fuzzy systems; Information resources; Knowledge engineering; Merging; Tree graphs; Query interface
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems and Knowledge Discovery, 2008. FSKD '08. Fifth International Conference on
Conference_Location
Shandong
Print_ISBN
978-0-7695-3305-6
Type
conf
DOI
10.1109/FSKD.2008.135
Filename
4666125