DocumentCode :
2015348
Title :
Visualization and Integration of Databases Using Self-Organizing Map
Author :
Bourennani, Farid ; Pu, Ken Q. ; Zhu, Ying
Author_Institution :
Univ. of Ontario Inst. of Technol., Oshawa, ON
fYear :
2009
fDate :
1-6 March 2009
Firstpage :
155
Lastpage :
160
Abstract :
As computer networks grow, accessible data is becoming increasingly distributed. Understanding and integrating remote and unfamiliar data sources are important data management issues. In this paper, we propose using self-organizing map (SOM) clustering to aid in visualizing similar columns and in integrating relational database tables and attributes based on their content. To accommodate the heterogeneous data types found in relational databases, we extend the TFIDF measure to handle numerical attribute types, in addition to text, for coincident meaning extraction. We present a SOM-clustering-based visualization algorithm that allows the user to browse heterogeneously typed database attributes and discover semantically similar clusters. Additionally, we propose a new algorithm, Common Item Based Classifier (CIBC), to improve the homogeneity of the clusters obtained by SOM. The discovered semantic clusters can significantly aid the manual or automated construction of data integrity constraints in data cleaning or of schema mappings in data integration.
Keywords :
data integrity; data mining; data visualisation; distributed databases; pattern classification; pattern clustering; query processing; relational databases; self-organising feature maps; common item based classifier algorithm; data integrity constraint; data visualization algorithm; distributed database browsing; heterogeneous data type; numerical data mining; relational database table; self-organizing map clustering; Application software; Clustering algorithms; Computer network management; Data mining; Data visualization; Distributed databases; Information retrieval; Relational databases; Self organizing feature maps; Visual databases; Common Item Based Classifier (CIBC); Data Integration; Information Retrieval (IR); SOM;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Advances in Databases, Knowledge, and Data Applications, 2009. DBKDA '09. First International Conference on
Conference_Location :
Gosier
Print_ISBN :
978-1-4244-3467-1
Electronic_ISBN :
978-0-7695-3550-0
Type :
conf
DOI :
10.1109/DBKDA.2009.30
Filename :
5071828