Title :
Rare Itemset Mining
Author :
Adda, Mehdi ; Wu, Lei ; Feng, Yi
Author_Institution :
Univ. of Montreal, Montreal
Abstract :
A pattern is a collection of events/features that occur together in a transaction database. Previous studies in the field are often dedicated to the problem of frequent pattern mining, where only patterns that appear frequently in the input data are mined. As a result, patterns involving events/features that appear in few data sets are not captured. In some domains, such as the detection of computer attacks or of fraudulent transactions in financial institutions, those patterns, also known as rare patterns, are more interesting than frequent patterns. We propose a framework to represent different categories of interesting patterns and then instantiate it for the specific case of rare patterns. We then present a generic framework to mine patterns based on the Apriori approach. In this paper we are interested in patterns composed of a set of items, also called itemsets. Thus, we instantiate the generalized Apriori framework to mine rare itemsets. The resulting approach is Apriori-like, and the main idea behind it is that if the itemset lattice representing the itemset space in classical Apriori approaches is traversed in a bottom-up manner, properties equivalent to those of the Apriori exploration of frequent itemsets become available for mining rare itemsets. These include an anti-monotone property and a level-wise exploration of the itemset space. As demonstrated by our experiments, our approach is effective in identifying all rare itemsets and is more efficient than the existing approach.
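The level-wise idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes that an itemset is "rare" when its support (the number of transactions containing it) is strictly below a threshold, that the traversal starts from the itemset containing every item and moves toward smaller itemsets, and that itemsets with zero support also count as rare. The names `support`, `mine_rare_itemsets`, and `max_support` are hypothetical. The pruning rule relies on a standard fact: every subset of a non-rare itemset is itself non-rare, so a (k-1)-itemset is kept as a candidate only if all of its k-item supersets were found rare.

```python
def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def mine_rare_itemsets(transactions, max_support):
    """Return {itemset: support} for every non-empty itemset whose
    support is strictly below `max_support` (an assumed definition
    of "rare", not taken from the paper)."""
    transactions = [frozenset(t) for t in transactions]
    items = frozenset().union(*transactions)
    if not items:
        return {}

    rare = {}
    level = {items}  # start from the itemset containing all items
    while level:
        # Keep only the candidates of this level that are actually rare.
        rare_level = set()
        for candidate in level:
            s = support(candidate, transactions)
            if s < max_support:
                rare[candidate] = s
                rare_level.add(candidate)

        # Build the next (smaller) level. A subset stays a candidate
        # only if every superset with one more item is rare; otherwise
        # it is contained in a non-rare itemset and cannot be rare.
        next_level = set()
        for itemset in rare_level:
            if len(itemset) <= 1:
                continue  # do not generate the empty itemset
            for item in itemset:
                subset = itemset - {item}
                if all((subset | {x}) in rare_level for x in items - subset):
                    next_level.add(subset)
        level = next_level
    return rare


if __name__ == "__main__":
    db = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"a"}]
    for itemset, s in sorted(mine_rare_itemsets(db, 2).items(),
                             key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), s)
```

In this toy database the itemsets {a}, {b}, and {a, b} are never reported, and {a} and {b} are not even counted, because {a, b} already fails the rarity test one level earlier.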
Keywords :
data mining; pattern recognition; computer attacks; financial institutions; fraudulent transactions; pattern mining; rare itemset mining; transaction database; Application software; Computer science; Data mining; Data security; Databases; Frequency; Itemsets; Machine learning; Operations research; Software engineering;
Conference_Title :
Machine Learning and Applications, 2007. ICMLA 2007. Sixth International Conference on
Conference_Location :
Cincinnati, OH
Print_ISBN :
978-0-7695-3069-7
DOI :
10.1109/ICMLA.2007.106