Title :
Mining Prevalence-Based Ratio Patterns
Author :
Zhang, Minghua ; Hsu, Wynne ; Lee, Mong Li
Author_Institution :
Nat. Univ. of Singapore, Singapore
Abstract :
Association rule mining aims to discover sets of features that occur together. A variation of association rule mining is ratio rule mining. A ratio rule is an eigenvector of the database that describes ratios of features. However, ratio rules are sensitive to outliers. In this work, we design a prevalence-based model for mining ratio patterns from a database. Our model is more robust to noise, and ratio patterns in our model have clear statistical meanings. We develop an algorithm to quickly determine the sets of features and their ratios that satisfy the prevalence requirement. Data structures such as hash tables and hash trees are used to further improve the efficiency of the algorithm. Experiments on synthetic data indicate the efficiency and scalability of the proposed algorithm. We also present a case study on US census data.
Keywords :
data mining; tree data structures; US census data; association rule mining; data structures; hash table; hash tree; prevalence-based ratio patterns mining; Association rules; Data mining; Educational institutions; Marketing and sales; Noise robustness; Remuneration; Signal to noise ratio; Spatial databases; Transaction databases; Tree data structures
Conference_Title :
Tools with Artificial Intelligence, 2007. ICTAI 2007. 19th IEEE International Conference on
Conference_Location :
Patras
Print_ISBN :
978-0-7695-3015-4
DOI :
10.1109/ICTAI.2007.95