DocumentCode :
1906904
Title :
Novel Class Detection and Feature via a Tiered Ensemble Approach for Stream Mining
Author :
Parker, Brendon ; Mustafa, Albara M. ; Khan, Latifur
Author_Institution :
Dept. of Comput. Sci., Univ. of Texas at Dallas, Richardson, TX, USA
Volume :
1
fYear :
2012
fDate :
7-9 Nov. 2012
Firstpage :
1171
Lastpage :
1178
Abstract :
Static data mining assumptions with regard to features and labels often fail in the streaming context. Features evolve, concepts drift, and novel classes are introduced. Therefore, any classification algorithm that intends to operate on streaming data must have mechanisms to mitigate the obsolescence of classifiers trained early in the stream. This is typically accomplished by either continually updating a monolithic model or incrementally updating an ensemble. Traditional static data mining algorithms are futile in a streaming context (and often in a distributed sensor network) because they need to iterate over the entire data set locally. Our approach -- named HSMiner (Hierarchical Stream Miner) -- applies hierarchical decomposition to the ensemble classifier concept. By breaking the classification problem into tiers, we can better prune irrelevant features and counter individual classification error through weighted voting and boosting. In addition, the atomic decomposition of feature inputs enables a straightforward mapping for distributing the ensemble among resources in the network. The implementation proves to be fast and memory-conservative, and we emulate a distributed environment via signal-linked threads. We present theoretical and empirical analysis of our approach, specifically examining the trade-offs of three different novel class detection variations, and compare these results to a similar method using benchmark data sets.
Keywords :
data mining; distributed processing; pattern classification; HSMiner; benchmark data sets; class detection variations; classification algorithm; distributed environment; ensemble classifier concept; feature input atomic decomposition; hierarchical decomposition approach; hierarchical stream miner; monolithic model; signal-linked threads; static data mining assumptions; stream mining; streaming data; tiered ensemble approach; weighted boosting; weighted voting; Accuracy; Algorithm design and analysis; Classification algorithms; Context; Data mining; Heuristic algorithms; Training; concept drift; distributed stream mining; feature evolution; hierarchical ensembles; novel class detection;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Tools with Artificial Intelligence (ICTAI), 2012 IEEE 24th International Conference on
Conference_Location :
Athens
ISSN :
1082-3409
Print_ISBN :
978-1-4799-0227-9
Type :
conf
DOI :
10.1109/ICTAI.2012.168
Filename :
6495184