DocumentCode :
174539
Title :
Semantic integration of heterogeneous relational schemas using multiple L1 linear regression and SVD
Author :
Harikumar, Sandhya ; Reethima, R. ; Kaimal, M.R.
Author_Institution :
Dept. of Comput. Sci. & Eng., Amrita Vishwa Vidyapeetham, Kollam, India
fYear :
2014
fDate :
26-28 Aug. 2014
Firstpage :
105
Lastpage :
111
Abstract :
The semantic integration of heterogeneous databases is a critical area of interest because of the growing scale of data and the need to share existing data as technology advances. Schema-level heterogeneity of the relations is the major obstacle to such integration. Although various approaches to schema analysis, transformation, and integration have been explored, they are sometimes too general to solve the problem, especially when the data is very high-dimensional and the schema information is unavailable or inadequate. This paper proposes a method that integrates heterogeneous relational schemas at the instance level rather than the schema level. A global schema is designed by integrating the most relevant attributes of the different relational schemas of a particular domain. To find the significant attributes, multiple linear regressions based on the L1 norm and Singular Value Decomposition (SVD) are applied to the data iteratively. This is a variant of L1-PCA, an efficient, effective, and meaningful method of linear subspace estimation. The most prominent instance-level similarity is found by identifying the most significant attributes of each relational data source and then measuring the similarity among those attributes using the L1 norm. The result is an integrated schema that maps the relevant attributes of each local schema to a global schema.
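The abstract's two stages (pick the significant attributes of each source, then compare them across sources with the L1 norm) can be illustrated with a minimal sketch. This is not the authors' algorithm: it substitutes an ordinary (L2) SVD for the iterative L1 regression, and it compares attributes through raw quantile profiles, which is an assumption made here so that sources with different row counts can be compared. The names `significant_attributes`, `quantile_profile`, and `l1_match`, and the synthetic age/salary data, are hypothetical.

```python
import numpy as np


def significant_attributes(X, k=2, top=2):
    """Rank columns by their weight in the top-k right singular vectors.

    Assumption: plain SVD stands in for the paper's L1-based variant.
    """
    Xc = X - X.mean(axis=0)                      # center each attribute
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    scores = (s[:k, None] * np.abs(vt[:k])).sum(axis=0)
    return [int(i) for i in np.argsort(scores)[::-1][:top]]


def quantile_profile(col, n=11):
    """Fixed-length summary of a column (hypothetical choice of signature)."""
    return np.quantile(col, np.linspace(0.0, 1.0, n))


def l1_match(A, cols_a, B, cols_b):
    """Map each chosen column of A to the chosen column of B whose
    quantile profile is closest in L1 distance."""
    matches = {}
    for a in cols_a:
        pa = quantile_profile(A[:, a])
        dists = [np.abs(pa - quantile_profile(B[:, b])).sum() for b in cols_b]
        matches[a] = cols_b[int(np.argmin(dists))]
    return matches


# Synthetic sources: same domain, different column order and row counts.
rng = np.random.default_rng(0)
age_a, sal_a = rng.uniform(20, 60, 200), rng.uniform(30000, 90000, 200)
age_b, sal_b = rng.uniform(20, 60, 300), rng.uniform(30000, 90000, 300)
A = np.column_stack([age_a, sal_a, rng.normal(0, 0.1, 200)])  # age, salary, noise
B = np.column_stack([sal_b, rng.normal(0, 0.1, 300), age_b])  # salary, noise, age

sig_a = significant_attributes(A)
sig_b = significant_attributes(B)
matches = l1_match(A, sig_a, B, sig_b)  # local column of A -> local column of B
```

In this sketch the resulting `matches` dictionary plays the role of the mapping from local attributes to a shared global attribute. Because the columns are left unscaled, large-valued attributes dominate the ranking; a real pipeline would likely need a normalization step.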
Keywords :
distributed databases; principal component analysis; regression analysis; relational databases; singular value decomposition; L1-PCA; SVD; data iteratively; heterogeneous relational schemas; linear subspace estimation method; multiple L1 Linear Regression; multiple linear regressions; relational data source; semantic integration; singular value decomposition; Context; Databases; Linear regression; Principal component analysis; Qualifications; Semantics; Linear Regression; Relational Schemas; SVD; Semantic Integration;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Science & Engineering (ICDSE), 2014 International Conference on
Conference_Location :
Kochi
Print_ISBN :
978-1-4799-6870-1
Type :
conf
DOI :
10.1109/ICDSE.2014.6974620
Filename :
6974620