Title :
An equivalence-class-based algorithm for the maximal number of candidate itemsets
Author :
Wang, Yun-Lan ; Li, Zeng-Zhi ; Qu, Ke-Wen
Author_Institution :
Dept. of Comput. Sci. & Technol., Xi'an Jiaotong Univ., China
Abstract :
Mining association rules is one of the most important problems in data mining. To devise an algorithm that reduces the number of database scans without risking a combinatorial explosion in the number of candidate itemsets, we must determine the maximal number of candidate itemsets that can be generated. In this paper, the theory of itemset equivalence classes is proposed. The properties of itemset equivalence classes are explored and some useful lemmas are presented. Based on this theory and the a priori property, we derive theorems on the maximal number of candidate itemsets. Based on these theorems, we devise an algorithm, EC, for calculating the maximal number of candidate itemsets. Furthermore, the performance study shows that algorithm EC is more accurate than algorithm KK, and that the cost of computing the maximal number is negligible compared to the cost of the complete association rule mining algorithm.
Keywords :
data mining; database theory; set theory; candidate itemsets; database scans; equivalence-class-based algorithm; itemset equivalence class; mining association rules; Association rules; Computer science; Costs; Data mining; Explosions; Itemsets; Partitioning algorithms; Switches; Transaction databases; Upper bound
Conference_Title :
Machine Learning and Cybernetics, 2003 International Conference on
Print_ISBN :
0-7803-8131-9
DOI :
10.1109/ICMLC.2003.1264485