DocumentCode :
1796341
Title :
Comparing datasets by attribute alignment
Author :
Smid, Jakub ; Neruda, Roman
Author_Institution :
Fac. of Math. & Phys., Charles Univ. in Prague, Prague, Czech Republic
fYear :
2014
fDate :
9-12 Dec. 2014
Firstpage :
56
Lastpage :
62
Abstract :
The metalearning approach to the model selection problem, which exploits the idea that algorithms perform similarly on similar datasets, requires a suitable metric on the dataset space. One common approach compares datasets based on a fixed number of features describing each dataset as a whole. The information based on individual attributes is usually aggregated, taken for the most relevant attributes only, or omitted altogether. In this paper, we propose an approach that aligns the complete sets of attributes of the datasets, allowing for different numbers of attributes. Given a distance between two attributes, one can find the alignment that minimizes the sum of the individual distances between aligned attributes. We present two methods that are able to find such an alignment. They differ in computational complexity and in their assumptions about the supplied distance function between two attributes. Experiments were performed using the proposed methods, and the results were compared with the baseline algorithm.
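The alignment objective described in the abstract can be illustrated with a minimal sketch. This is not either of the two methods the paper presents (the record does not describe them); it is a brute-force search that simply tries every one-to-one mapping of the smaller attribute set into the larger one. The function name `align_attributes`, the distance-matrix input, and the `unmatched_penalty` parameter are assumptions made for illustration, as is the choice to charge a flat cost for attributes left unaligned.

```python
from itertools import permutations


def align_attributes(dist, unmatched_penalty=0.0):
    """Brute-force minimum-cost alignment between two attribute sets.

    dist[i][j] is the supplied distance between attribute i of dataset A
    and attribute j of dataset B. Returns (total_cost, pairs), where pairs
    is a list of (index_in_A, index_in_B) tuples.

    unmatched_penalty is a hypothetical flat cost per unaligned attribute;
    the source record does not say how unaligned attributes are treated.
    """
    n_a = len(dist)
    n_b = len(dist[0]) if n_a else 0

    # Always map the smaller attribute set into the larger one.
    transposed = n_a > n_b
    if transposed:
        dist = [list(col) for col in zip(*dist)]
        n_a, n_b = n_b, n_a

    best_cost, best_pairs = float("inf"), []
    # Each permutation assigns a distinct target to every source attribute.
    for targets in permutations(range(n_b), n_a):
        cost = sum(dist[i][j] for i, j in enumerate(targets))
        if cost < best_cost:
            best_cost, best_pairs = cost, list(enumerate(targets))

    # Charge the assumed penalty for attributes that stay unaligned.
    best_cost += unmatched_penalty * (n_b - n_a)

    if transposed:
        # Restore the (index_in_A, index_in_B) orientation.
        best_pairs = [(j, i) for i, j in best_pairs]
    return best_cost, best_pairs


if __name__ == "__main__":
    # Dataset A has 2 attributes, dataset B has 3.
    distances = [[1, 9, 5],
                 [8, 2, 7]]
    print(align_attributes(distances))
```

The search is factorial in the number of attributes, so it is only practical for very small attribute sets. When the problem is a plain linear assignment, a polynomial-time solver such as the Hungarian algorithm would be the usual substitute.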
Keywords :
data mining; learning (artificial intelligence); attribute alignment; computational complexity; metalearning approach; model selection problem; Equations; Feature extraction; Labeling; Machine learning algorithms; Measurement
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Intelligence and Data Mining (CIDM), 2014 IEEE Symposium on
Conference_Location :
Orlando, FL
Type :
conf
DOI :
10.1109/CIDM.2014.7008148
Filename :
7008148