DocumentCode
1796663
Title
Distributed evolutionary approach to data clustering and modeling
Author
Hajeer, Mustafa ; Dasgupta, Dipankar ; Semenov, Alexander ; Veijalainen, Jari
Author_Institution
Dept. of Comput. Sci., Univ. of Memphis, Memphis, TN, USA
fYear
2014
fDate
9-12 Dec. 2014
Firstpage
142
Lastpage
148
Abstract
In this article we describe a framework (DEGA-Gen) for applying distributed genetic algorithms to the detection of communities in networks. The framework proposes efficient ways of encoding the network in the chromosomes, which greatly optimizes memory use and computation and makes the framework scalable. Different objective functions may be used to divide the network into communities. The framework is implemented using Hadoop, an open source implementation of the MapReduce paradigm. We validate the framework by developing a community detection algorithm that uses modularity as the measure of a division's quality. The result of the algorithm is the network partitioned into non-overlapping communities in such a way that network modularity is maximized. We apply the algorithm to well-known data sets, such as the Zachary Karate club, the bottlenose Dolphins network, the College football dataset, and the US political books dataset. The framework achieves comparable modularity while using much less memory for the network representation. Further, the framework is scalable and can deal with large graphs, as it was tested on a larger youtube.com dataset.
Keywords
data handling; distributed algorithms; genetic algorithms; parallel processing; pattern clustering; public domain software; College football dataset; Hadoop; MapReduce paradigm; US political books dataset; Zachary Karate club; bottlenose Dolphins network; chromosomes; community detection algorithm; data clustering; data modeling; data sets; distributed evolutionary approach; distributed genetic algorithms; network encoding; network modularity; nonoverlapping communities; open source implementation; Biological cells; Clustering algorithms; Communities; Encoding; Genetic algorithms; Image edge detection; Media; HDFS; MapReduce; analysis; distributed clustering; evolutionary; graph analysis; graph clustering; large graphs; multi objective; social media
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence and Data Mining (CIDM), 2014 IEEE Symposium on
Conference_Location
Orlando, FL
Type
conf
DOI
10.1109/CIDM.2014.7008660
Filename
7008660