DocumentCode
655276
Title
Combining Multiple Features for Web Data Sources Clustering
Author
Algergawy, Alsayed ; Saake, Gunter
Author_Institution
Dept. of Comput. Sci., Otto-von-Guericke Univ., Magdeburg, Germany
fYear
2013
fDate
11-13 Sept. 2013
Firstpage
213
Lastpage
218
Abstract
The number of web data sources is growing significantly and, as a consequence, crucial data management issues must be addressed. Clustering, one such issue that many researchers have focused on, has been proposed to improve information availability. To this end, we propose a feature-based approach for clustering web data sources without any human intervention, based only on features extracted from the source schemas. In particular, we combine linguistic and structure features of each data source to enhance the computation of schema similarity. We experimentally demonstrate the effectiveness of the proposed approach in terms of both clustering quality and runtime.
Keywords
Internet; data handling; pattern clustering; Web data sources clustering; crucial data management; feature-based clustering; information availability; linguistic; multiple features; structure features; Clustering algorithms; Feature extraction; Power capacitors; Pragmatics; Semantics; Vectors; XML
fLanguage
English
Publisher
IEEE
Conference_Titel
e-Business Engineering (ICEBE), 2013 IEEE 10th International Conference on
Conference_Location
Coventry
Type
conf
DOI
10.1109/ICEBE.2013.32
Filename
6686265