DocumentCode :
2748611
Title :
Parallel out-of-core divide-and-conquer techniques with application to classification trees
Author :
Sreenivas, Mahesh K. ; Alsabti, Khaled ; Ranka, Sanjay
Author_Institution :
Dept. of CISE, Florida Univ., Gainesville, FL, USA
fYear :
1999
fDate :
12-16 Apr 1999
Firstpage :
555
Lastpage :
562
Abstract :
Classification is an important problem in the field of data mining. Construction of good classifiers is computationally intensive and offers plenty of scope for parallelization. The divide-and-conquer paradigm can be used to efficiently construct decision tree classifiers. We discuss in detail various techniques for parallel divide-and-conquer and extend these techniques to efficiently handle disk-resident data. Furthermore, a generic technique for parallel out-of-core divide-and-conquer problems is suggested. We present pCLOUDS, the parallel version of the decision tree classifier algorithm CLOUDS, capable of handling large out-of-core data sets. pCLOUDS exhibits excellent speedup, sizeup and scaleup properties, which make it a competitive tool for data mining applications. We evaluate the performance of pCLOUDS for a range of synthetic data sets on the IBM-SP2.
Keywords :
classification; data mining; decision trees; divide and conquer methods; parallel algorithms; software performance evaluation; classifiers; decision tree classifiers; disk-resident data; divide-and-conquer; pCLOUDS; parallel divide-and-conquer; performance; Classification tree analysis; Clouds; Concatenated codes; Degradation; Delay; Read only memory; Testing
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Proceedings of the 13th International Parallel Processing Symposium and 10th Symposium on Parallel and Distributed Processing (IPPS/SPDP 1999)
Conference_Location :
San Juan
Print_ISBN :
0-7695-0143-5
Type :
conf
DOI :
10.1109/IPPS.1999.760532
Filename :
760532