• DocumentCode
    117235
  • Title

    Efficient approaches to interleaved sampling of training data for symbolic regression

  • Author

    Muhammad Atif Azad, R. ; Medernach, David ; Ryan, Colan

  • Author_Institution
    CSIS Dept., Univ. of Limerick, Limerick, Ireland
  • fYear
    2014
  • fDate
    July 30 2014-Aug. 1 2014
  • Firstpage
    176
  • Lastpage
    183
  • Abstract
    The ability to generalise beyond the training set is paramount for any machine learning algorithm, and Genetic Programming (GP) is no exception. This paper investigates a recently proposed technique for improving generalisation in GP, termed Interleaved Sampling, in which GP alternates between using the entire data set and only a single data point in successive generations. The paper proposes two alternatives to using a single data point: random search, and simply minimising the tree size. Both approaches are more efficient than the original Interleaved Sampling because they skip fitness evaluation in half of the generations. The results show that, in terms of generalisation, random search and size minimisation are as effective as the original Interleaved Sampling while requiring less data processing. Size minimisation is particularly interesting because it completely prevents bloat while remaining competitive in both training results and generalisation. The tree sizes with size minimisation are also substantially smaller, which considerably reduces the computational expense.
  • Keywords
    genetic algorithms; regression analysis; sampling methods; trees (mathematics); GP; data processing; generalisation; genetic programming; interleaved sampling; machine learning algorithm; random search; symbolic regression; tree size minimisation; tree sizes; Biological system modeling; Boats; Concrete; Genetics; Programming; Sociology; Statistics; Genetic Programming; optimisation; overfitting
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Nature and Biologically Inspired Computing (NaBIC), 2014 Sixth World Congress on
  • Conference_Location
    Porto
  • Print_ISBN
    978-1-4799-5936-5
  • Type

    conf

  • DOI
    10.1109/NaBIC.2014.6921874
  • Filename
    6921874
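
The three schemes named in the abstract differ only in how individuals are scored in the alternate generations. The following is a minimal Python sketch of that idea, not the authors' implementation. The function name `make_fitness`, the variant labels, the use of mean squared error, the choice of even generations for the full data set, and drawing the single data point once per generation are all assumptions made for illustration; the record does not specify them.

```python
import random
from typing import Callable, List, Tuple

Point = Tuple[float, float]               # one (x, y) training pair
Model = Callable[[float], float]          # an evolved expression, called as model(x)
Fitness = Callable[[Model, int], float]   # (model, tree_size) -> score to minimise


def make_fitness(generation: int, data: List[Point],
                 variant: str, rng: random.Random) -> Fitness:
    """Build the scoring function used for every individual in one generation.

    Hypothetical helper: even generations use the whole data set, odd
    generations use the scheme selected by `variant` (an assumed convention).
    """
    if generation % 2 == 0:
        # Full-data generation: mean squared error over all training points.
        def full(model: Model, tree_size: int) -> float:
            return sum((model(x) - y) ** 2 for x, y in data) / len(data)
        return full

    if variant == "single_point":
        # Original Interleaved Sampling: score on one data point only.
        # Assumption: the point is drawn once and shared by the population.
        x, y = rng.choice(data)
        return lambda model, tree_size: (model(x) - y) ** 2

    if variant == "random_search":
        # No data is touched: each individual receives a random score.
        return lambda model, tree_size: rng.random()

    if variant == "size_minimisation":
        # No data is touched: smaller trees are simply preferred.
        return lambda model, tree_size: float(tree_size)

    raise ValueError(f"unknown variant: {variant}")
```

For example, `make_fitness(1, data, "size_minimisation", rng)(model, 7)` returns `7.0` without reading `data`, which is the sense in which the two proposed alternatives avoid fitness evaluation in half of the generations.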