DocumentCode :
655276
Title :
Combining Multiple Features for Web Data Sources Clustering
Author :
Algergawy, Alsayed ; Saake, Gunter
Author_Institution :
Dept. of Comput. Sci., Otto-von-Guericke Univ., Magdeburg, Germany
fYear :
2013
fDate :
11-13 Sept. 2013
Firstpage :
213
Lastpage :
218
Abstract :
The number of web data sources is growing significantly, and as a consequence, crucial data management issues must be addressed. Clustering is one such issue that many researchers have focused on, and it has been proposed as a way to improve information availability. To this end, we propose a feature-based approach for clustering web data sources without any human intervention, based only on features extracted from the source schemas. In particular, we combine linguistic and structure features of each data source to enhance the computation of schema similarity. We experimentally demonstrate the effectiveness of the proposed approach in terms of both clustering quality and runtime.
Keywords :
Internet; data handling; pattern clustering; Web data sources clustering; crucial data management; feature-based clustering; information availability; linguistic; multiple features; structure features; Clustering algorithms; Feature extraction; Power capacitors; Pragmatics; Semantics; Vectors; XML;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
e-Business Engineering (ICEBE), 2013 IEEE 10th International Conference on
Conference_Location :
Coventry
Type :
conf
DOI :
10.1109/ICEBE.2013.32
Filename :
6686265