Title :
Performance analysis of MK-means clustering algorithm with normalization approach
Author :
Patel, Vaishali Rajeev ; Mehta, Rupa G.
Author_Institution :
Dept. of Comput. Eng., Shri S'ad Vidhya Mandal Inst. of Technol., Bharuch, India
Abstract :
Real-world applications in science and engineering are growing rapidly, and data mining is an important stage in relating research to applications. Unsupervised learning techniques cluster data objects based on their similarity. Incomplete, noisy, and inconsistent data may slow down the knowledge discovery in databases process. Data preprocessing techniques improve the quality of the data, thereby helping to improve the accuracy and efficiency of the subsequent mining processes. Data cleaning is an important preprocessing task that avoids redundancies during data integration. Normalization is an additional preprocessing task that contributes to the success of the data mining process; in normalization, the data to be analyzed is scaled to a specific range. K-means is a well-known partition-based clustering algorithm, yet it has the shortcoming that the number of clusters and the initial centroids must be supplied in advance. This paper proposes a modified K-means algorithm (MK-means) that initializes the centroids automatically, and analyzes the performance of MK-means when integrated with a cleaning method and normalization techniques, which shows an improvement in the performance of the MK-means algorithm.
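As an illustration of scaling data to a specific range, the sketch below applies min-max normalization column by column. The abstract does not say which normalization techniques the paper evaluates, so min-max scaling is an assumption here, and the function name `min_max_normalize` and the sample data are hypothetical. This is not the MK-means algorithm itself, only a possible preprocessing step before clustering.

```python
def min_max_normalize(rows, new_min=0.0, new_max=1.0):
    """Scale each column of `rows` linearly into [new_min, new_max].

    Assumption: min-max scaling is used purely as an example of
    range normalization; the paper may use other techniques.
    """
    columns = list(zip(*rows))  # transpose rows into columns
    scaled_columns = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        if span == 0:
            # Constant column: map every value to the lower bound.
            scaled_columns.append([new_min] * len(col))
        else:
            scaled_columns.append(
                [new_min + (v - lo) * (new_max - new_min) / span for v in col]
            )
    # Transpose back so each inner list is one data object again.
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical data: two attributes on very different scales.
data = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]]
normalized = min_max_normalize(data)
```

After scaling, both attributes span the same range, so neither dominates the distance computations that a K-means style algorithm relies on.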
Keywords :
data mining; pattern classification; unsupervised learning; MK-means clustering algorithm; centroid automatic initialization; data cleaning; data integration; data objects; data preprocessing techniques; database process; knowledge discovery; normalization approach; partition based clustering algorithm; performance analysis; unsupervised learning techniques; Algorithm design and analysis; Classification algorithms; Cleaning; Clustering algorithms; Data preprocessing; Partitioning algorithms; Clustering Techniques; K-means; Normalization; Preprocessing;
Conference_Title :
Information and Communication Technologies (WICT), 2011 World Congress on
Conference_Location :
Mumbai
Print_ISBN :
978-1-4673-0127-5
DOI :
10.1109/WICT.2011.6141380