DocumentCode :
1102392
Title :
A super-programming approach for mining association rules in parallel on PC clusters
Author :
Jin, Dejiang ; Ziavras, Sotirios G.
Author_Institution :
Dept. of Electr. & Comput. Eng., New Jersey Inst. of Technol., Newark, NJ, USA
Volume :
15
Issue :
9
Year :
2004
Firstpage :
783
Lastpage :
794
Abstract :
PC clusters have become popular in parallel processing. They do not involve specialized interprocessor networks, so the latency of data communications is rather long. The programming models for PC clusters are often different from those for parallel machines or supercomputers containing sophisticated interprocessor communication networks. For PC clusters, load balancing among the nodes becomes a more critical issue in attempts to yield high performance. We introduce a new model for program development on PC clusters, namely, the super-programming model (SPM). The workload is modeled as a collection of super-instructions (SIs). We propose that a set of SIs be designed for each application domain. They should constitute an orthogonal set of frequently used high-level operations in the corresponding application domain. Each SI should normally be implemented as a high-level language routine that can execute on any PC. Application programs are modeled as super-programs (SPs), which are coded using SIs. SIs are dynamically assigned to available PCs at runtime. Because of the known granularity of SIs, an upper bound on their execution time can be estimated at static time. Therefore, dynamic load balancing becomes an easier task. Our motivation is to support dynamic load balancing and code porting, especially for applications with diverse sets of inputs such as data mining. Here, we apply SPM to the implementation of an Apriori-like algorithm for mining association rules. Our experiments show that the average idle time per node is kept very low.
Keywords :
data mining; parallel algorithms; parallel programming; resource allocation; workstation clusters; PC clusters; Apriori-like algorithm; application programs; association rule mining; code porting; data communications; dynamic load balancing; high-level language; interprocessor communication networks; parallel machines; parallel processing; program development; super-instructions; super-programming model; supercomputers; association rules; data communication; delay; load management; mining association rules; cluster computing; load balancing
Language :
English
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
Publisher :
IEEE
ISSN :
1045-9219
Type :
jour
DOI :
10.1109/TPDS.2004.37
Filename :
1333650