• DocumentCode
    683514
  • Title

    Implementation of projected clustering based on SQL queries and UDFs in relational databases

  • Author

    Harikumar, Sandhya ; Haripriya, H. ; Kaimal, M.R.

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Amrita Vishwa Vidyapeetham, Kollam, India
  • fYear
    2013
  • fDate
    19-21 Dec. 2013
  • Firstpage
    7
  • Lastpage
    12
  • Abstract
    Projected clustering is a clustering approach that finds clusters in subspaces of high-dimensional data. Although a very large data set can be clustered efficiently outside a relational database, the time and effort needed to export and import it can be significant. Commercial RDBMSs provide no SQL query for any type of subspace clustering, even though it is better suited to large databases with many dimensions and a large number of records. Integrating clustering with a relational DBMS using SQL is an important and challenging problem in today's world of Big Data. Projected clustering can identify closely correlated dimensions and find clusters in the corresponding subspaces. We have designed an SQL version of projected clustering that returns the clusters of the records in the database using a single SQL statement, which in turn calls other SQL functions that we define. We used the PostgreSQL DBMS to validate our implementation and experimented with both synthetic and real data.
  • Keywords
    Big Data; SQL; pattern clustering; relational databases; very large databases; PostgreSQL DBMS; SQL queries; UDFs; commercial RDBMSs; high dimensional data; large databases; projected clustering; very large data set clustering; Clustering algorithms; Data handling; Data storage systems; Information management; Standards
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Computational Systems (RAICS), 2013 IEEE Recent Advances in
  • Conference_Location
    Trivandrum
  • Print_ISBN
    978-1-4799-2177-5
  • Type

    conf

  • DOI
    10.1109/RAICS.2013.6745438
  • Filename
    6745438