DocumentCode :
2772044
Title :
Efficient Discovery of Confounders in Large Data Sets
Author :
Zhou, Wenjun ; Xiong, Hui
Author_Institution :
MSIS Dept., Rutgers, State Univ. of New Jersey, Newark, NJ, USA
fYear :
2009
fDate :
6-9 Dec. 2009
Firstpage :
647
Lastpage :
656
Abstract :
Given a large transaction database, association analysis is concerned with efficiently finding strongly related objects. Unlike traditional association analysis, where relationships among variables are searched for at a global level, we examine confounding factors at a local level. Indeed, many real-world phenomena are localized to specific regions and times, and these relationships may not be visible when the entire data set is analyzed. In particular, confounding effects that change the direction of a correlation are the most significant. Along this line, we propose to efficiently find confounding effects attributable to local associations. Specifically, we derive an upper bound from a necessary condition for confounders, which helps prune the search space and identify confounders efficiently. Experimental results show that the proposed CONFOUND algorithm effectively identifies confounders and runs an order of magnitude faster than benchmark methods.
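The idea of a confounder that changes the direction of a correlation can be illustrated with a small sketch. This is not the CONFOUND algorithm from the paper, and it does not implement the paper's upper bound or pruning; it only computes the phi correlation coefficient and the standard first-order partial correlation (both named in the keywords) on made-up binary data. The function names `phi`, `partial_phi`, and `reverses_direction`, the toy data, and the convention of returning 0.0 for a degenerate denominator are all assumptions introduced here for illustration.

```python
from math import sqrt


def phi(x, y):
    """Phi correlation coefficient of two equal-length 0/1 sequences."""
    n = len(x)
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)  # both present
    n1_ = sum(x)  # transactions containing x
    n_1 = sum(y)  # transactions containing y
    denom = sqrt(n1_ * (n - n1_) * n_1 * (n - n_1))
    if denom == 0:
        return 0.0  # assumed convention when a variable is constant
    return (n * n11 - n1_ * n_1) / denom


def partial_phi(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = phi(x, y), phi(x, z), phi(y, z)
    denom = sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denom == 0:
        return 0.0  # assumed convention when z fully determines x or y
    return (r_xy - r_xz * r_yz) / denom


def reverses_direction(x, y, z):
    """True if controlling for z flips the sign of the x-y correlation."""
    return phi(x, y) * partial_phi(x, y, z) < 0


# Toy data (hypothetical): z splits the ten transactions into two groups.
x = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y = [1, 1, 1, 0, 1, 0, 0, 0, 0, 1]
z = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

print(phi(x, y), partial_phi(x, y, z), reverses_direction(x, y, z))
```

On this toy data the global phi coefficient of `x` and `y` is positive (about 0.2), while the partial correlation controlling for `z` is negative (about -0.25), so `z` behaves like a direction-reversing confounder in the sense the abstract describes.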
Keywords :
database management systems; transaction processing; CONFOUND algorithm; association analysis; large data sets; search space; transaction database; Bioinformatics; Costs; Data analysis; Data mining; Diseases; Economies of scale; Public healthcare; Transaction databases; USA Councils; Upper bound; Confounder; Correlation; Local Association; Partial Correlation; Phi Correlation coefficient;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2009. ICDM '09. Ninth IEEE International Conference on
Conference_Location :
Miami, FL
ISSN :
1550-4786
Print_ISBN :
978-1-4244-5242-2
Electronic_ISBN :
1550-4786
Type :
conf
DOI :
10.1109/ICDM.2009.77
Filename :
5360291