• DocumentCode
    3183148
  • Title

    Performance analysis of MK-means clustering algorithm with normalization approach

  • Author

    Patel, Vaishali Rajeev ; Mehta, Rupa G.

  • Author_Institution
    Dept. of Comput. Eng., Shri S'ad Vidhya Mandal Inst. of Technol., Bharuch, India
  • fYear
    2011
  • fDate
    11-14 Dec. 2011
  • Firstpage
    974
  • Lastpage
    979
  • Abstract
    Real-world applications in science and engineering are growing rapidly, and data mining is an important stage in relating research to applications. Data objects are clustered by similarity using unsupervised learning techniques. Incomplete, noisy, and inconsistent data may slow down the knowledge discovery in databases process. Data preprocessing techniques improve the quality of the data, thereby helping to improve the accuracy and efficiency of the subsequent mining processes. Data cleaning is an important preprocessing task for avoiding redundancies during data integration. Normalization is an additional preprocessing task that contributes to the success of the data mining process; in normalization, the data to be analyzed are scaled to a specific range. K-means is a well-known partition-based clustering algorithm, yet it has the shortcoming that the number of clusters and the initial centroids must be supplied in advance. This paper proposes a modified K-means algorithm (MK-means) that initializes the centroids automatically, and it analyzes the performance of MK-means when integrated with a cleaning method and normalization techniques, which is shown to improve the algorithm's performance.
  • Keywords
    data mining; pattern classification; unsupervised learning; MK-means clustering algorithm; centroid automatic initialization; data cleaning; data integration; data mining; data objects; data preprocessing techniques; database process; knowledge discovery; normalization approach; partition based clustering algorithm; performance analysis; unsupervised learning techniques; Algorithm design and analysis; Classification algorithms; Cleaning; Clustering algorithms; Data mining; Data preprocessing; Partitioning algorithms; Clustering Techniques; Data Mining; K-means; Normalization; Preprocessing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information and Communication Technologies (WICT), 2011 World Congress on
  • Conference_Location
    Mumbai
  • Print_ISBN
    978-1-4673-0127-5
  • Type

    conf

  • DOI
    10.1109/WICT.2011.6141380
  • Filename
    6141380
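
The abstract describes normalization as scaling the data to a specific range before clustering. A minimal sketch of one common way to do this, min-max scaling, is given below. It is an illustration only: the record does not say which normalization techniques the paper evaluates, and the function name `min_max_normalize` and its parameters `new_min` and `new_max` are hypothetical. The MK-means centroid initialization is not shown because the abstract does not specify it.

```python
def min_max_normalize(rows, new_min=0.0, new_max=1.0):
    """Rescale each column of `rows` linearly into [new_min, new_max].

    Assumption: min-max scaling is used here purely as an example of
    "scaling data to a specific range"; it is not taken from the paper.
    """
    # Transpose the row-oriented data so each attribute can be scaled on its own.
    columns = list(zip(*rows))
    scaled_columns = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        if span == 0:
            # Constant column: map every value to new_min to avoid dividing by zero.
            scaled_columns.append([new_min] * len(col))
        else:
            scaled_columns.append(
                [new_min + (v - lo) * (new_max - new_min) / span for v in col]
            )
    # Transpose back to the original row-oriented layout.
    return [list(row) for row in zip(*scaled_columns)]


if __name__ == "__main__":
    data = [[1, 10], [2, 20], [3, 30]]
    print(min_max_normalize(data))
```

Scaling every attribute to a common range keeps attributes with large numeric values from dominating the distance computations that a K-means style algorithm relies on.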