• DocumentCode
    2753767
  • Title
    Developing an Effective Validation Strategy for Genetic Programming Models Based on Multiple Datasets
  • Author
    Liu, Yi ; Khoshgoftaar, Taghi ; Yao, Jenq-Foung
  • Author_Institution
    Georgia Coll. & State Univ., Milledgeville, GA
  • fYear
    2006
  • fDate
    16-18 Sept. 2006
  • Firstpage
    232
  • Lastpage
    237
  • Abstract
    Genetic programming (GP) is a parallel search technique in which many solutions are obtained simultaneously during the search. However, when applied to real-world classification tasks, some of the obtained solutions may have poor predictive performance. One reason is that these solutions only match the shape of the training dataset and fail to learn and generalize the patterns hidden in it. As a result, unexpectedly poor results are obtained when the solutions are applied to the test dataset. This paper addresses how to remove solutions that will perform unacceptably on the test dataset. The proposed method applies a multi-dataset validation phase as a filter in GP-based classification tasks. By comparing the proposed method with a standard GP classifier on datasets from seven different NASA software projects, we demonstrate that multi-dataset validation is effective and can significantly improve the performance of GP-based software quality classification models.
  • Keywords
    genetic algorithms; pattern classification; program verification; software quality; NASA software project; genetic programming; model selection; multidataset validation; paired t-tests; software metrics; software quality classification; Filters; NASA; Pattern matching; Performance evaluation; Shape; Software performance; Software standards; Testing; cost misclassification; multiple datasets; validation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Reuse and Integration, 2006 IEEE International Conference on
  • Conference_Location
    Waikoloa Village, HI
  • Print_ISBN
    0-7803-9788-6
  • Type
    conf
  • DOI
    10.1109/IRI.2006.252418
  • Filename
    4018495