DocumentCode :
1796663
Title :
Distributed evolutionary approach to data clustering and modeling
Author :
Hajeer, Mustafa ; Dasgupta, Dipankar ; Semenov, Alexander ; Veijalainen, Jari
Author_Institution :
Dept. of Comput. Sci., Univ. of Memphis, Memphis, TN, USA
fYear :
2014
fDate :
9-12 Dec. 2014
Firstpage :
142
Lastpage :
148
Abstract :
In this article we describe a framework (DEGA-Gen) for applying distributed genetic algorithms to the detection of communities in networks. The framework proposes efficient ways of encoding the network in the chromosomes, which greatly reduce memory use and computation and make the framework scalable. Different objective functions may be used to divide the network into communities. The framework is implemented using Hadoop, an open source implementation of the MapReduce paradigm. We validate the framework by developing a community detection algorithm that uses modularity as the measure of the division. The result of the algorithm is the network partitioned into non-overlapping communities in such a way that network modularity is maximized. We apply the algorithm to well-known data sets, such as the Zachary Karate Club, the bottlenose Dolphins network, the College football dataset, and the US political books dataset. The framework shows comparable results in achieved modularity while using much less memory for network representation. Further, the framework is scalable and can deal with large graphs, as it was tested on a larger youtube.com dataset.
Keywords :
data handling; distributed algorithms; genetic algorithms; parallel processing; pattern clustering; public domain software; College football dataset; Hadoop; MapReduce paradigm; US political books dataset; Zachary Karate club; bottlenose Dolphins network; chromosomes; community detection algorithm; data clustering; data modeling; data sets; distributed evolutionary approach; distributed genetic algorithms; network encoding; network modularity; nonoverlapping communities; open source implementation; Biological cells; Clustering algorithms; Communities; Encoding; Genetic algorithms; Image edge detection; Media; HDFS; MapReduce; analysis; distributed clustering; evolutionary; graph analysis; graph clustering; large graphs; multi objective; social media;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Intelligence and Data Mining (CIDM), 2014 IEEE Symposium on
Conference_Location :
Orlando, FL
Type :
conf
DOI :
10.1109/CIDM.2014.7008660
Filename :
7008660