DocumentCode
683514
Title
Implementation of projected clustering based on SQL queries and UDFs in relational databases
Author
Harikumar, Sandhya ; Haripriya, H. ; Kaimal, M.R.
Author_Institution
Dept. of Comput. Sci. & Eng., Amrita Vishwa Vidyapeetham, Kollam, India
fYear
2013
fDate
19-21 Dec. 2013
Firstpage
7
Lastpage
12
Abstract
Projected clustering is a clustering approach that finds clusters in subspaces of high-dimensional data. Although a very large data set can be clustered efficiently outside a relational database, the time and effort needed to export and import it can be significant. Commercial RDBMSs offer no SQL query for any type of subspace clustering, even though such clustering is better suited to large databases with many dimensions and a large number of records. Integrating clustering with a relational DBMS using SQL is an important and challenging problem in today's world of Big Data. Projected clustering can identify closely correlated dimensions and find clusters in the corresponding subspaces. We have designed an SQL version of projected clustering that returns the clusters of the records in the database through a single SQL statement, which in turn calls other SQL functions that we define. We validated our implementation on the PostgreSQL DBMS and experimented with both synthetic and real data.
Keywords
Big Data; SQL; pattern clustering; relational databases; very large databases; PostgreSQL DBMS; SQL queries; UDFs; commercial RDBMSs; high dimensional data; large databases; projected clustering; very large data set clustering; Clustering algorithms; Data handling; Data storage systems; Information management; Standards
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Computational Systems (RAICS), 2013 IEEE Recent Advances in
Conference_Location
Trivandrum
Print_ISBN
978-1-4799-2177-5
Type
conf
DOI
10.1109/RAICS.2013.6745438
Filename
6745438