DocumentCode
2207454
Title
PGLCM: Efficient Parallel Mining of Closed Frequent Gradual Itemsets
Author
Do, Trong Dinh Thac ; Laurent, Anne ; Termier, Alexandre
Author_Institution
CNRS, Grenoble Univ., Grenoble, France
fYear
2010
fDate
13-17 Dec. 2010
Firstpage
138
Lastpage
147
Abstract
Numerical data (e.g., DNA microarray data, sensor data) pose a challenging problem for existing frequent pattern mining methods, which handle them poorly. In this setting, gradual patterns have recently been proposed to extract covariations of attributes, such as "When X increases, Y decreases". Some algorithms exist for mining frequent gradual patterns, but they cannot scale to real-world databases. In this paper we present GLCM, the first algorithm for mining closed frequent gradual patterns, which offers strong complexity guarantees: the mining time is linear in the number of closed frequent gradual itemsets. Our experimental study shows that GLCM is two orders of magnitude faster than the state of the art, with constant low memory usage. We also present PGLCM, a parallelization of GLCM capable of exploiting multicore processors, with good scale-up properties on complex datasets. These are the first algorithms capable of mining large real-world datasets to discover gradual patterns.
Keywords
data mining; multiprocessing systems; set theory; closed frequent gradual itemset; linear mining time; multicore processor; numerical data; pattern mining; Data mining; frequent pattern mining; gradual itemsets; parallelism
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining (ICDM), 2010 IEEE 10th International Conference on
Conference_Location
Sydney, NSW
ISSN
1550-4786
Print_ISBN
978-1-4244-9131-5
Electronic_ISBN
1550-4786
Type
conf
DOI
10.1109/ICDM.2010.101
Filename
5693967