Title :
An Efficient and Fast Parzen-Window Density Based Clustering Method for Large Data Sets
Author :
Suresh Babu, V. ; Viswanath, P.
Author_Institution :
Dept. of Comput. Sci. & Eng., Indian Inst. of Technol., Guwahati
Abstract :
Density-based clustering techniques such as DBSCAN find arbitrarily shaped clusters along with noisy outliers. DBSCAN estimates the density at a point by counting the number of points falling in a sphere of radius ε, and it has a time complexity of O(n^2); hence it is not suitable for large data sets. This paper proposes an efficient and fast Parzen-window density-based clustering method that uses (i) prototypes to reduce the computational burden and (ii) a smooth kernel function to estimate the density at a point, so that the estimated density also varies smoothly. Enriched prototypes are derived using the counted leaders method. These are used with a special form of the Gaussian kernel function that is radially symmetric and can therefore be completely specified by a single variance parameter. The proposed method is experimentally compared with DBSCAN, and the results show that it is suitable for large data sets.
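The two ingredients named in the abstract, counted leaders as prototypes and a radially symmetric Gaussian kernel, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the algorithm's details, so the distance threshold `tau`, the bandwidth `sigma`, the function names, and the choice to weight each leader's kernel by its count are all assumptions made for illustration. The clustering step built on top of the density estimate is not shown.

```python
import math

def counted_leaders(points, tau):
    """Single scan over the data: a point joins the first leader within
    distance tau (assumed threshold) and increments that leader's count;
    otherwise it becomes a new leader with count 1."""
    leaders = []  # each entry is [leader_point, count]
    for p in points:
        for entry in leaders:
            if math.dist(p, entry[0]) <= tau:
                entry[1] += 1
                break
        else:
            leaders.append([p, 1])
    return [(tuple(leader), count) for leader, count in leaders]

def parzen_density(x, leaders, sigma):
    """Count-weighted Parzen-window estimate at x using an isotropic
    Gaussian kernel, which is fully specified by the variance sigma**2."""
    n = sum(count for _, count in leaders)  # total number of original points
    d = len(x)
    norm = (2.0 * math.pi) ** (d / 2.0) * sigma ** d
    total = 0.0
    for leader, count in leaders:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, leader))
        total += count * math.exp(-sq_dist / (2.0 * sigma ** 2))
    return total / (n * norm)

# Toy data (made up for illustration): two small, well-separated groups.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
leaders = counted_leaders(points, tau=0.5)
dense = parzen_density((0.0, 0.0), leaders, sigma=0.5)
sparse = parzen_density((2.5, 2.5), leaders, sigma=0.5)
```

Because the density is evaluated over the leaders rather than over all n points, each query costs time proportional to the number of prototypes, which is the kind of saving the abstract attributes to using prototypes.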
Keywords :
Gaussian processes; pattern clustering; very large databases; Gaussian kernel function; Parzen-window density estimation based clustering method; counted leaders method; large data set; smooth kernel function; variance parameter; Clustering methods; Computer science; Data engineering; Kernel; Noise shaping; Prototypes; Shape; DBSCAN; Leaders clustering method; Parzen-Window
Conference_Title :
Emerging Trends in Engineering and Technology, 2008. ICETET '08. First International Conference on
Conference_Location :
Nagpur, Maharashtra
Print_ISBN :
978-0-7695-3267-7
Electronic_ISBN :
978-0-7695-3267-7
DOI :
10.1109/ICETET.2008.166