• DocumentCode
    2334637
  • Title
    The computational complexity of high-dimensional correlation search
  • Author
    Jermaine, Christopher
  • Author_Institution
    Coll. of Comput., Georgia Inst. of Technol., GA, USA
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    249
  • Lastpage
    256
  • Abstract
    There is a growing awareness that the popular support metric (often used to guide search in market-basket analysis) is not appropriate for use in every association mining application. Support measures only the co-occurrence frequency of a set of events when determining which patterns to report back to the user. It incorporates no rigorous statistical notion of surprise or interest, and many of the patterns deemed interesting by the support metric are uninteresting to the user. However, a positive aspect of support is that search using support is very efficient. The question addressed in the paper is: can we retain this efficiency if we move beyond support to other, more rigorous metrics? We consider the computational implications of incorporating simple expectation into the data mining task. It turns out that many variations on the problem which incorporate more rigorous tests of dependence (or independence) result in NP-hard problem definitions.
  • Keywords
    associative processing; computational complexity; data mining; search problems; NP-hard problem definitions; association mining application; co-occurrence frequency; computational implications; data mining task; high-dimensional correlation search; market-basket analysis searching; rigorous metrics; simple expectation; support metric; Appropriate technology; Educational institutions; Frequency measurement; NP-hard problem; Probability; Statistical analysis; Statistical distributions; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on
  • Conference_Location
    San Jose, CA
  • Print_ISBN
    0-7695-1119-8
  • Type
    conf
  • DOI
    10.1109/ICDM.2001.989526
  • Filename
    989526