DocumentCode :
3739334
Title :
Selecting Machine Learning Algorithms Using Regression Models
Author :
Tri Doan;Jugal Kalita
Author_Institution :
Dept. of Comput. Sci., Univ. of Colorado, Colorado Springs, CO, USA
fYear :
2015
Firstpage :
1498
Lastpage :
1505
Abstract :
In performing data mining, a common task is to search for the most appropriate algorithm(s) to retrieve important information from data. With an increasing number of available data mining techniques, it may be impractical to experiment with many techniques on a specific dataset of interest to find the best algorithm(s). In this paper, we demonstrate the suitability of tree-based multi-variable linear regression in predicting algorithm performance. We take into account prior machine learning experience to construct meta-knowledge for supervised learning. The idea is to use summary knowledge about datasets along with past performance of algorithms on these datasets to build this meta-knowledge. We augment pure statistical summaries with descriptive features and a misclassification cost, and discover that transformed datasets obtained by reducing a high dimensional feature space to a smaller dimension still retain significant characteristic knowledge necessary to predict algorithm performance. Our approach works well for both numerical and nominal data obtained from real world environments.
Keywords :
"Prediction algorithms","Machine learning algorithms","Measurement","Training","Data mining","Error analysis","Regression tree analysis"
Publisher :
IEEE
Conference_Title :
Data Mining Workshop (ICDMW), 2015 IEEE International Conference on
Electronic_ISBN :
2375-9259
Type :
conf
DOI :
10.1109/ICDMW.2015.43
Filename :
7395848