Title :
Parallel classification for data mining on shared-memory multiprocessors
Author :
Zaki, Mohammed J. ; Ho, Ching-Tien ; Agrawal, Rakesh
Author_Institution :
Dept. of Comput. Sci., Rensselaer Polytech. Inst., Troy, NY, USA
Abstract :
Presents parallel algorithms for building decision-tree classifiers on shared-memory multiprocessor (SMP) systems. The proposed algorithms span the gamut of data and task parallelism. The data parallelism is based on attribute scheduling among processors. This basic scheme is extended with task pipelining and dynamic load balancing to yield faster implementations. The task-parallel approach uses dynamic subtree partitioning among processors. Our performance evaluation shows that the construction of a decision-tree classifier can be effectively parallelized on an SMP machine with good speedup.
Keywords :
data mining; decision trees; parallel algorithms; pattern classification; pipeline processing; processor scheduling; resource allocation; shared memory systems; software performance evaluation; attribute scheduling; data parallelism; decision-tree classifiers; dynamic load balancing; dynamic subtree partitioning; implementation speed; parallel classification; performance evaluation; shared-memory multiprocessors; speedup; task parallelism; task pipelining; Classification tree analysis; Clouds; Computer science; Databases; Load management; Parallel processing; Predictive models
Conference_Title :
Proceedings of the 15th International Conference on Data Engineering, 1999
Conference_Location :
Sydney, NSW
Print_ISBN :
0-7695-0071-4
DOI :
10.1109/ICDE.1999.754925