• DocumentCode
    167654
  • Title

    SOM Clustering Using Spark-MapReduce

  • Author

    Sarazin, Tugdual ; Azzag, Hanane ; Lebbah, Mustapha

  • Author_Institution
    ALTIC, Paris, France
  • fYear
    2014
  • fDate
    19-23 May 2014
  • Firstpage
    1727
  • Lastpage
    1734
  • Abstract
    In this paper, we consider designing clustering algorithms that can be used in MapReduce on the Spark platform, one of the most popular programming environments for processing large datasets. We focus on the practical and popular serial Self-Organizing Map (SOM) clustering algorithm. SOM is one of the best-known unsupervised learning algorithms and is useful for cluster analysis of large quantities of data. We have designed two scalable implementations of the SOM-MapReduce algorithm. We report experiments and demonstrate performance in terms of classification accuracy, Rand index, and speedup, using real and synthetic data with 100 million points on varying numbers of cores.
  • Keywords
    pattern clustering; self-organising feature maps; unsupervised learning; MapReduce; SOM clustering; Spark platform; classification accuracy; self-organizing map clustering; unsupervised learning algorithm; Algorithm design and analysis; Clustering algorithms; Machine learning algorithms; Programming; Prototypes; Sparks; Vectors; Clustering; MapReduce; Self-Organizing Map; Spark
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International
  • Conference_Location
    Phoenix, AZ
  • Print_ISBN
    978-1-4799-4117-9
  • Type

    conf

  • DOI
    10.1109/IPDPSW.2014.192
  • Filename
    6969583