DocumentCode :
2709221
Title :
Efficient Discovery of Statistically Significant Association Rules
Author :
Hamalainen, W. ; Nykanen, M.
Author_Institution :
Dept. of Comput. Sci., Univ. of Helsinki, Helsinki
fYear :
2008
fDate :
15-19 Dec. 2008
Firstpage :
203
Lastpage :
212
Abstract :
Searching for statistically significant association rules is an important but neglected problem. Traditional association rules do not capture the idea of statistical dependence, so the resulting rules can be spurious, while the most significant rules may be missing. This leads to erroneous models and predictions, which often become expensive. The problem is computationally very difficult, because significance is not a monotonic property. However, in this paper we prove several other properties, which can be used for pruning the search space. The properties are implemented in the StatApriori algorithm, which searches for statistically significant, non-redundant association rules. Based on both theoretical and empirical observations, the resulting rules are very accurate compared to traditional association rules. In addition, StatApriori can work with extremely low frequencies, thus finding new interesting rules.
Keywords :
data mining; StatApriori algorithm; efficient statistically significant association rule discovery; nonredundant association rules; statistically significant association rule searching; Association rules; Clustering algorithms; Cost function; Data analysis; Lagrangian functions; Linear discriminant analysis; Support vector machine classification; Support vector machines; Unsupervised learning; association rule; statistical significance
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining, 2008. ICDM '08. Eighth IEEE International Conference on
Conference_Location :
Pisa
ISSN :
1550-4786
Print_ISBN :
978-0-7695-3502-9
Type :
conf
DOI :
10.1109/ICDM.2008.144
Filename :
4781115