DocumentCode
3166242
Title
Cross-Mining Binary and Numerical Attributes
Author
Garriga, Gemma C. ; Heikinheimo, Hannes ; Seppänen, Jouni K.
Author_Institution
Helsinki Univ. of Technol., Helsinki
fYear
2007
fDate
28-31 Oct. 2007
Firstpage
481
Lastpage
486
Abstract
We consider the problem of relating itemsets mined on binary attributes of a data set to numerical attributes of the same data. An example is biogeographical data, where the numerical attributes correspond to environmental variables and the binary attributes encode the presence or absence of species in different environments. From the viewpoint of itemset mining, the task is to select a small collection of interesting itemsets using the numerical attributes; from the viewpoint of the numerical attributes, the task is to constrain the search for local patterns (e.g. clusters) using the binary attributes. We give a formal definition of the problem, discuss it theoretically, give a simple constant-factor approximation algorithm, and show by experiments on biogeographical data that the algorithm can capture interesting patterns that would not have been found using either itemset mining or clustering alone.
Keywords
approximation theory; data mining; binary attributes encode; biogeographical data; constant-factor approximation algorithm; cross-mining binary; data set; itemset mining; Approximation algorithms; Bioinformatics; Birds; Clustering algorithms; Data mining; Demography; Information science; Itemsets; Motion pictures; Temperature;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on
Conference_Location
Omaha, NE
ISSN
1550-4786
Print_ISBN
978-0-7695-3018-5
Type
conf
DOI
10.1109/ICDM.2007.32
Filename
4470277