Regional Information Center for Science and Technology (RICEST) - Evolving Big Data Stream Classification with MapReduce

DocumentCode :

172926

Title :

Evolving Big Data Stream Classification with MapReduce

Author :

Haque, Ashraful ; Parker, Brendon ; Khan, Latifur ; Thuraisingham, Bhavani

Author_Institution :

Dept. of Comput. Sci., Univ. of Texas at Dallas, Richardson, TX, USA

fYear :

2014

fDate :

June 27, 2014 - July 2, 2014

Firstpage :

570

Lastpage :

577

Abstract :

Big Data Stream mining poses inherent challenges that are not present in traditional data mining. A Big Data Stream not only receives a large volume of data continuously, but may also have different types of features. Moreover, the concepts and features tend to evolve throughout the stream. Traditional data mining techniques are not sufficient to address these challenges. In our current work, we have designed a multi-tiered, ensemble-based method, HSMiner, that addresses these challenges in order to label instances in an evolving Big Data Stream. However, this method requires building a large number of AdaBoost ensembles for each of the numeric features after receiving each new data chunk, which is very costly. Thus, HSMiner may face scalability issues when classifying a Big Data Stream. To address this problem, we propose three approaches that build this large number of AdaBoost ensembles using MapReduce-based parallelism. We compare these approaches with respect to different aspects of their design. We also show empirically that these approaches help our base method achieve significant scalability and speedup.
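The abstract does not describe the three proposed approaches, so the following is only a minimal sketch of one way per-feature AdaBoost training could be split into map and reduce steps. It assumes one ensemble of decision stumps per numeric feature, labels in {-1, +1}, and a single-process simulation of the shuffle step rather than a real MapReduce cluster. All function names (`map_chunk`, `shuffle`, `reduce_feature`, `build_ensembles`) are hypothetical and are not taken from HSMiner.

```python
import math
from collections import defaultdict


def map_chunk(chunk):
    """Map phase: emit (feature_index, (value, label)) for every numeric feature."""
    for features, label in chunk:
        for j, value in enumerate(features):
            yield j, (value, label)


def shuffle(pairs):
    """Group mapper output by key, as a MapReduce framework would do."""
    groups = defaultdict(list)
    for key, record in pairs:
        groups[key].append(record)
    return groups


def stump_predict(value, threshold, polarity):
    """A decision stump on one feature: predicts polarity at or above the threshold."""
    return polarity if value >= threshold else -polarity


def best_stump(records, weights):
    """Return (weighted_error, threshold, polarity) of the lowest-error stump."""
    best = (float("inf"), None, 1)
    for threshold in sorted({v for v, _ in records}):
        for polarity in (1, -1):
            err = sum(w for (v, y), w in zip(records, weights)
                      if stump_predict(v, threshold, polarity) != y)
            if err < best[0]:
                best = (err, threshold, polarity)
    return best


def reduce_feature(records, rounds=5):
    """Reduce phase: train a small AdaBoost ensemble of stumps on one feature."""
    n = len(records)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = best_stump(records, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified records, then renormalize.
        weights = [w * math.exp(-alpha * y * stump_predict(v, threshold, polarity))
                   for (v, y), w in zip(records, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, value):
    """Weighted vote of the stumps in one per-feature ensemble."""
    score = sum(alpha * stump_predict(value, threshold, polarity)
                for alpha, threshold, polarity in ensemble)
    return 1 if score >= 0 else -1


def build_ensembles(chunk, rounds=5):
    """Run map, shuffle, and reduce over one data chunk."""
    groups = shuffle(map_chunk(chunk))
    return {j: reduce_feature(records, rounds) for j, records in groups.items()}


# Toy data chunk (made up for illustration): feature 0 separates the classes.
chunk = [((1.0, 5.0), -1), ((2.0, 1.0), -1), ((3.0, 4.0), -1),
         ((7.0, 2.0), 1), ((8.0, 5.0), 1), ((9.0, 3.0), 1)]
ensembles = build_ensembles(chunk)
```

In this sketch each reducer call is independent of the others, which is the property that would let a framework run them in parallel across features. How the paper actually partitions the work is not stated in this record.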

Keywords :

Big Data; data mining; learning (artificial intelligence); pattern classification; AdaBoost; HSMiner; MapReduce; multitiered ensemble based method; stream classification; stream mining; Distributed databases; Indexes; Parallel processing; Scalability; Sorting; Distributed Processing; Evolving Big Data Stream

fLanguage :

English

Publisher :

IEEE

Conference_Title :

2014 IEEE 7th International Conference on Cloud Computing (CLOUD)

Conference_Location :

Anchorage, AK

Print_ISBN :

978-1-4799-5062-1

Type :

conf

DOI :

10.1109/CLOUD.2014.82

Filename :

6973788

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=172926