DocumentCode
2369677
Title
A high-performance distributed algorithm for mining association rules
Author
Schuster, Assaf ; Wolff, Ran ; Trock, Dan
Author_Institution
Technion-Israel Inst. of Technol., Haifa, Israel
fYear
2003
fDate
19-22 Nov. 2003
Firstpage
291
Lastpage
298
Abstract
We present a new distributed association rule mining (D-ARM) algorithm that demonstrates superlinear speedup with the number of computing nodes. The algorithm is the first D-ARM algorithm to perform a single scan over the database. As such, its performance is unmatched by any previous algorithm. Scale-up experiments over standard synthetic benchmarks demonstrate stable run time regardless of the number of computers. Theoretical analysis reveals a tighter bound on error probability than the one shown in the corresponding sequential algorithm.
Keywords
data mining; distributed algorithms; error statistics; very large databases; D-ARM algorithm; distributed association rule mining; error probability; high-performance distributed algorithm; scale-up experiment; sequential algorithm; Association rules; Clustering algorithms; Costs; Data mining; Distributed algorithms; Distributed databases; Itemsets; Partitioning algorithms; Sampling methods; Transaction databases
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining, 2003. ICDM 2003. Third IEEE International Conference on
Print_ISBN
0-7695-1978-4
Type
conf
DOI
10.1109/ICDM.2003.1250932
Filename
1250932