  • DocumentCode
    2035210
  • Title
    Scaling k-medoid algorithm for clustering large categorical dataset and its performance analysis
  • Author
    Joshi, Ritesh ; Patidar, Anil ; Mishra, Surendra
  • Author_Institution
    MCA, MITM, Indore, India
  • Volume
    2
  • fYear
    2011
  • fDate
    8-10 April 2011
  • Firstpage
    117
  • Lastpage
    121
  • Abstract
    Scalable data mining algorithms have become crucial to efficiently support KDD processes on large datasets. The k-medoid algorithm is one of the partitioning algorithms used for clustering. We show that the basic k-medoid algorithm is very time-consuming on large datasets. Instead, we present an advanced algorithm that performs much better than the known algorithm. In addition to presenting detailed experimental results for the advanced k-medoid algorithm, we also conduct an experimental study with real-life datasets to demonstrate the effectiveness of our technique. We address the task of scaling up k-medoid-based algorithms through the use of memoization. Experimental results on several datasets, including synthetic and real data, show that the proposed algorithm may reduce the number of distance calculations by a factor of more than a thousand compared to existing algorithms while producing clusters of comparable quality.
  • Keywords
    data mining; optimisation; pattern clustering; KDD process; categorical dataset clustering; k-medoid algorithm; memoization technique; partitioning algorithm; scalable data mining algorithm; Algorithm design and analysis; Clustering algorithms; Complexity theory; Data mining; Indexes; Machine learning algorithms; Partitioning algorithms; Categorical Dataset; Clustering; K-medoid; Memoization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Electronics Computer Technology (ICECT), 2011 3rd International Conference on
  • Conference_Location
    Kanyakumari
  • Print_ISBN
    978-1-4244-8678-6
  • Electronic_ISBN
    978-1-4244-8679-3
  • Type
    conf
  • DOI
    10.1109/ICECTECH.2011.5941667
  • Filename
    5941667