• DocumentCode
    3166242
  • Title
    Cross-Mining Binary and Numerical Attributes
  • Author
    Garriga, Gemma C. ; Heikinheimo, Hannes ; Seppänen, Jouni K.
  • Author_Institution
    Helsinki Univ. of Technol., Helsinki
  • fYear
    2007
  • fDate
    28-31 Oct. 2007
  • Firstpage
    481
  • Lastpage
    486
  • Abstract
    We consider the problem of relating itemsets mined on binary attributes of a data set to numerical attributes of the same data. An example is biogeographical data, where the numerical attributes correspond to environmental variables and the binary attributes encode the presence or absence of species in different environments. From the viewpoint of itemset mining, the task is to select a small collection of interesting itemsets using the numerical attributes; from the viewpoint of the numerical attributes, the task is to constrain the search for local patterns (e.g. clusters) using the binary attributes. We give a formal definition of the problem, discuss it theoretically, give a simple constant-factor approximation algorithm, and show by experiments on biogeographical data that the algorithm can capture interesting patterns that would not have been found using either itemset mining or clustering alone.
  • Keywords
    approximation theory; data mining; binary attributes encode; biogeographical data; constant-factor approximation algorithm; cross-mining binary; data set; itemset mining; Approximation algorithms; Bioinformatics; Birds; Clustering algorithms; Data mining; Demography; Information science; Itemsets; Motion pictures; Temperature
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on
  • Conference_Location
    Omaha, NE
  • ISSN
    1550-4786
  • Print_ISBN
    978-0-7695-3018-5
  • Type
    conf
  • DOI
    10.1109/ICDM.2007.32
  • Filename
    4470277