DocumentCode :
239104
Title :
A combined MapReduce-windowing two-level parallel scheme for evolutionary prototype generation
Author :
Triguero, Isaac ; Peralta, Daniel ; Bacardit, Jaume ; Garcia, Sergio ; Herrera, Francisco
Author_Institution :
Dept. of Comput. Sci. & Artificial Intell., Univ. of Granada, Granada, Spain
fYear :
2014
fDate :
6-11 July 2014
Firstpage :
3036
Lastpage :
3043
Abstract :
Evolutionary prototype generation techniques have proven useful for improving the capabilities of the nearest neighbor classifier. They act as data reduction algorithms by generating representative points of a given problem. Their main purposes are to speed up classification and to reduce the storage requirements and noise sensitivity of the nearest neighbor rule. As the amount of available data grows, such reduction techniques become more important. However, their applicability can be limited to problems with no more than tens of thousands of instances. To address this limitation, we develop a two-level parallelization scheme for evolutionary prototype generation methods. First, it distributes the execution of these algorithms across several tasks using a MapReduce framework. Then, within each of these tasks (mappers), it accelerates the prototype generation process with a windowing approach. This model enables evolutionary prototype generation algorithms to be applied to large-scale classification problems without accuracy loss. Our preliminary experiments on a dataset of 1 million instances show that this proposal is an appropriate tool for improving the performance of the nearest neighbor classifier with big data.
Keywords :
Big Data; data reduction; parallel processing; pattern classification; classification process; combined MapReduce-windowing two-level parallel scheme; data reduction algorithm; evolutionary prototype generation techniques; large-scale classification problems; nearest neighbor classifier; nearest neighbor rule; noise sensitivity; representative point generation; storage requirements reduction; Acceleration; Computational modeling; Data mining; Prototypes; Runtime; Training
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Evolutionary Computation (CEC), 2014 IEEE Congress on
Conference_Location :
Beijing
Print_ISBN :
978-1-4799-6626-4
Type :
conf
DOI :
10.1109/CEC.2014.6900490
Filename :
6900490