  • DocumentCode
    3739334
  • Title
    Selecting Machine Learning Algorithms Using Regression Models
  • Author
    Tri Doan; Jugal Kalita
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Colorado, Colorado Springs, CO, USA
  • fYear
    2015
  • Firstpage
    1498
  • Lastpage
    1505
  • Abstract
    In performing data mining, a common task is to search for the most appropriate algorithm(s) to retrieve important information from data. With an increasing number of available data mining techniques, it may be impractical to experiment with many techniques on a specific dataset of interest to find the best algorithm(s). In this paper, we demonstrate the suitability of tree-based multi-variable linear regression in predicting algorithm performance. We take into account prior machine learning experience to construct meta-knowledge for supervised learning. The idea is to use summary knowledge about datasets along with past performance of algorithms on these datasets to build this meta-knowledge. We augment pure statistical summaries with descriptive features and a misclassification cost, and discover that transformed datasets obtained by reducing a high-dimensional feature space to a smaller dimension still retain significant characteristic knowledge necessary to predict algorithm performance. Our approach works well for both numerical and nominal data obtained from real-world environments.
  • Keywords
    "Prediction algorithms","Machine learning algorithms","Measurement","Training","Data mining","Error analysis","Regression tree analysis"
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining Workshop (ICDMW), 2015 IEEE International Conference on
  • Electronic_ISBN
    2375-9259
  • Type
    conf
  • DOI
    10.1109/ICDMW.2015.43
  • Filename
    7395848