DocumentCode
2486055
Title
Distributed randomized algorithms for low-support data mining
Author
Ferro, Alfredo ; Giugno, Rosalba ; Mongiovì, Misael ; Pulvirenti, Alfredo
Author_Institution
Dept. of Math. & Comput. Sci., Univ. of Catania, Catania, Italy
fYear
2009
fDate
23-29 May 2009
Firstpage
1
Lastpage
7
Abstract
Data mining in distributed systems has been facilitated by using high-support association rules. Less attention has been paid to distributed low-support/high-correlation data mining, which has proved useful in several fields, such as computational biology, wireless networks, web mining, security, and rare-event analysis in industrial plants. In this paper we present distributed versions of efficient algorithms for low-support/high-correlation data mining, such as Min-Hashing, K-Min-Hashing, and Locality-Sensitive Hashing. Experimental results on real data concerning scalability, speed-up, and network traffic are reported.
Keywords
data mining; distributed algorithms; file organisation; randomised algorithms; computational biology; distributed high-correlation data mining; distributed randomized algorithms; high-support association rules; industrial plants; k-min-hashing; locality-sensitive-hashing; low-support data mining; min-hashing; rare events analysis; security; web mining; wireless networks; Association rules; Bioinformatics; Computational biology; Data mining; Matrix decomposition; Partitioning algorithms; Scalability; Transaction databases; Web mining; Workstations;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel & Distributed Processing, 2009. IPDPS 2009. IEEE International Symposium on
Conference_Location
Rome
ISSN
1530-2075
Print_ISBN
978-1-4244-3751-1
Electronic_ISBN
1530-2075
Type
conf
DOI
10.1109/IPDPS.2009.5161156
Filename
5161156