Regional Information Center for Science and Technology - Clustering Large Databases in Distributed Environment

DocumentCode :

3073090

Title :

Clustering Large Databases in Distributed Environment

Author :

Pakhira, Malay K.

Author_Institution :

Kalyani Gov. Eng. Coll., Kalyani

fYear :

2009

fDate :

6-7 March 2009

Firstpage :

351

Lastpage :

358

Abstract :

In this article, a distributed clustering technique suitable for large data sets is presented. The algorithm is a modified version of the widely used k-means algorithm, with changes that make it executable in a distributed environment. For large inputs, the running time of the k-means algorithm is very high: its complexity is O(TKN), where K is the number of desired clusters, T is the number of iterations, and N is the number of input patterns. The high time complexity of serial k-means can be reduced substantially by executing it in a distributed parallel environment. Here, we describe a new distributed clustering algorithm and compare its performance with that of some existing algorithms. Experimental results show that this distributed approach can provide higher speedups while maintaining all necessary characteristics of the serial k-means algorithm. We have successfully applied the new algorithm to cluster a number of data sets, including a large satellite image data set.
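The abstract does not say how the work is divided among nodes, so the following is only a minimal sketch of one common data-parallel formulation of k-means, not necessarily the author's algorithm. It assumes the N patterns are split into partitions, that each node assigns its own patterns to the nearest centroid and reports per-cluster partial sums and counts, and that a coordinator merges them into new centroids. The names `local_step`, `merge_and_update`, and `distributed_kmeans` are hypothetical, and the nodes are simulated by an ordinary loop rather than real network communication.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def local_step(points, centroids):
    """Work of one simulated node: assign its local points to the nearest
    centroid and return per-cluster coordinate sums and point counts."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p in points:
        nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
        counts[nearest] += 1
        for d in range(dim):
            sums[nearest][d] += p[d]
    return sums, counts


def merge_and_update(partials, centroids):
    """Coordinator step: combine the partial sums and counts from all nodes
    into new centroids. An empty cluster keeps its previous centroid."""
    k, dim = len(centroids), len(centroids[0])
    updated = []
    for j in range(k):
        total = sum(counts[j] for _, counts in partials)
        if total == 0:
            updated.append(list(centroids[j]))
        else:
            updated.append(
                [sum(sums[j][d] for sums, _ in partials) / total for d in range(dim)]
            )
    return updated


def distributed_kmeans(partitions, initial_centroids, max_iters=100, tol=1e-9):
    """Iterate local steps and merges until no centroid moves by more than
    tol (in squared distance) or max_iters is reached."""
    centroids = [list(c) for c in initial_centroids]
    for _ in range(max_iters):
        # In a real deployment each partition would be processed on its own node.
        partials = [local_step(part, centroids) for part in partitions]
        updated = merge_and_update(partials, centroids)
        shift = max(squared_distance(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift <= tol:
            break
    return centroids
```

Under these assumptions each node scans only its own share of the N patterns per iteration and exchanges just K partial sums and counts, which is where a speedup over the serial O(TKN) cost would come from. Given the same initial centroids, the merged update is the same mean that serial k-means computes, up to floating-point rounding, so calling `distributed_kmeans([points], init)` with a single partition reproduces the serial behaviour.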

Keywords :

computational complexity; database theory; parallel algorithms; pattern clustering; very large databases; distributed clustering; distributed parallel environment; large data set; large database; serial k-means algorithm; time complexity; Clustering algorithms; Data engineering; Distributed computing; Distributed databases; Educational institutions; Ethernet networks; Government; Satellite broadcasting; Size measurement; Time measurement; clustering; complexity; distributed environment; k-means; large database;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Advance Computing Conference, 2009. IACC 2009. IEEE International

Conference_Location :

Patiala

Print_ISBN :

978-1-4244-2927-1

Electronic_ISBN :

978-1-4244-2928-8

Type :

conf

DOI :

10.1109/IADCC.2009.4809035

Filename :

4809035

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3073090