• DocumentCode
    2772044
  • Title
    Efficient Discovery of Confounders in Large Data Sets
  • Author
    Zhou, Wenjun ; Xiong, Hui
  • Author_Institution
    MSIS Dept., Rutgers, State Univ. of New Jersey, Newark, NJ, USA
  • fYear
    2009
  • fDate
    6-9 Dec. 2009
  • Firstpage
    647
  • Lastpage
    656
  • Abstract
    Given a large transaction database, association analysis is concerned with efficiently finding strongly related objects. Unlike traditional association analysis, where relationships among variables are searched at a global level, we examine confounding factors at a local level. Indeed, many real-world phenomena are localized to specific regions and times. These relationships may not be visible when the entire data set is analyzed. In particular, confounding effects that change the direction of correlation are the most significant. Along this line, we propose to efficiently find confounding effects attributable to local associations. Specifically, we derive an upper bound from a necessary condition of confounders, which helps us prune the search space and efficiently identify confounders. Experimental results show that the proposed CONFOUND algorithm can effectively identify confounders and that it runs an order of magnitude faster than benchmark methods.
  • Keywords
    database management systems; transaction processing; CONFOUND algorithm; association analysis; large data sets; search space; transaction database; Bioinformatics; Costs; Data analysis; Data mining; Diseases; Economies of scale; Public healthcare; Transaction databases; USA Councils; Upper bound; Confounder; Correlation; Local Association; Partial Correlation; Phi Correlation coefficient;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2009. ICDM '09. Ninth IEEE International Conference on
  • Conference_Location
    Miami, FL
  • ISSN
    1550-4786
  • Print_ISBN
    978-1-4244-5242-2
  • Electronic_ISBN
    1550-4786
  • Type
    conf
  • DOI
    10.1109/ICDM.2009.77
  • Filename
    5360291