• DocumentCode
    2836106
  • Title

    Combining Parallel Self-Organizing Maps and K-Means to Cluster Distributed Data

  • Author

    Gorgônio, Flavius L. ; Costa, José Alfredo F.

  • Author_Institution
    Fed. Univ. of Rio Grande do Norte, Natal
  • fYear
    2008
  • fDate
    16-18 July 2008
  • Firstpage
    53
  • Lastpage
    58
  • Abstract
    Clustering is the process of discovering groups within multidimensional data, based on similarities, with minimal knowledge of their structure. In previous work, we presented an algorithm (partSOM) for clustering distributed datasets, based on self-organizing maps (SOM). This work extends that approach by presenting a strategy for efficient cluster analysis in distributed databases using SOM and K-means. The proposed strategy applies the SOM algorithm separately to each distributed dataset, corresponding to a vertical partition of the database, to obtain a representative subset of each local dataset. These representative subsets are then sent to a central site, which fuses the partial results and applies the SOM and K-means algorithms to obtain the final result. Experimental results are compared with the traditional SOM and partSOM approaches on different datasets.
  • Keywords
    data handling; distributed databases; pattern clustering; self-organising feature maps; database vertical partitions; distributed data clustering; k-means; multidimensional data; parallel self-organizing maps; partSOM; Clustering algorithms; Data analysis; Data mining; Data privacy; Partitioning algorithms; Pattern analysis; Self organizing feature maps; Signal analysis; Storage automation; Distributed data mining; self-organizing maps
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Science and Engineering Workshops, 2008. CSEWORKSHOPS '08. 11th IEEE International Conference on
  • Conference_Location
    São Paulo
  • Print_ISBN
    978-0-7695-3257-8
  • Type

    conf

  • DOI
    10.1109/CSEW.2008.65
  • Filename
    4625039
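
The pipeline summarized in the abstract (a local SOM per vertical partition, fusion at a central site, then SOM followed by K-means) can be sketched roughly as below. This is one plausible reading of the abstract, not the authors' implementation: the function names, map size, learning-rate schedule, and the choice to represent each local sample by the codebook vector of its best-matching unit are all assumptions made for illustration.

```python
import numpy as np

def train_som(data, rows=4, cols=4, epochs=20, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns a codebook of shape (rows*cols, dim)."""
    rng = np.random.default_rng(seed)
    n_units = rows * cols
    # Initialise the codebook from randomly chosen samples.
    codebook = data[rng.integers(0, len(data), n_units)].astype(float)
    # Grid coordinates of every unit, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    sigma0 = max(rows, cols) / 2.0
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total
            lr = lr0 * (1.0 - frac)                # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5    # shrinking neighbourhood radius
            bmu = np.argmin(((codebook - x) ** 2).sum(axis=1))
            dist2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-dist2 / (2.0 * sigma ** 2))
            codebook += lr * h[:, None] * (x - codebook)
            step += 1
    return codebook

def nearest(data, centers):
    """Index of the closest row of `centers` for every row of `data`."""
    d = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def kmeans(data, k, iters=50, seed=0):
    """Plain Lloyd-style K-means; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), k, replace=False)].astype(float)
    for _ in range(iters):
        labels = nearest(data, centers)
        for j in range(k):
            if np.any(labels == j):                # keep old center if cluster is empty
                centers[j] = data[labels == j].mean(axis=0)
    return centers, nearest(data, centers)

def cluster_vertical_partitions(partitions, k, seed=0):
    """partitions: list of arrays with the same rows but different columns."""
    # Local step (assumed): each site trains a SOM on its own columns and
    # replaces every sample with the codebook vector of its best-matching unit.
    projected = []
    for i, part in enumerate(partitions):
        codebook = train_som(part, seed=seed + i)
        projected.append(codebook[nearest(part, codebook)])
    # Central step (assumed): concatenate the local representatives column-wise,
    # train a SOM on the fused data, then run K-means on that SOM's codebook.
    fused = np.hstack(projected)
    central = train_som(fused, seed=seed)
    _, unit_labels = kmeans(central, k, seed=seed)
    # Each sample inherits the cluster of its best-matching central unit.
    return unit_labels[nearest(fused, central)]

# Hypothetical usage: two synthetic blobs whose four columns live on two sites.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (30, 4)), rng.normal(5, 0.3, (30, 4))])
labels = cluster_vertical_partitions([X[:, :2], X[:, 2:]], k=2)
print(np.bincount(labels, minlength=2))
```

In this sketch only codebook-projected values are fused at the central step, which reflects the abstract's idea of sending a representative subset instead of the raw local data; how the paper actually encodes and transmits those representatives is not stated in this record.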