  • DocumentCode
    3175950
  • Title
    Using semi-supervised clustering to improve regression test selection techniques
  • Author
    Chen, Songyu ; Chen, Zhenyu ; Zhao, Zhihong ; Xu, Baowen ; Feng, Yang
  • Author_Institution
    State Key Lab. for Novel Software Technol., Nanjing Univ., Nanjing, China
  • fYear
    2011
  • fDate
    21-25 March 2011
  • Firstpage
    1
  • Lastpage
    10
  • Abstract
    Cluster test selection has been proposed as an efficient regression testing approach. It uses distance measures and clustering algorithms to group tests into clusters. Tests in the same cluster are considered to have similar behaviors. A sampling strategy is then applied to the clustering result to build a small subset of tests, which is expected to approximate the fault detection capability of the original test set. All existing cluster test selection methods employ unsupervised clustering: the previous test results are not used in the clustering process, which may lead to unsatisfactory clustering results in some cases. In this paper, a semi-supervised clustering method, namely semi-supervised K-means (SSKM), is introduced to improve cluster test selection. SSKM uses limited supervision in the form of pairwise constraints: Must-link and Cannot-link. These pairwise constraints are derived from previous test results to improve clustering results as well as test selection results. The experimental results illustrate the effectiveness of cluster test selection methods with SSKM. Two useful observations are made by analysis. (1) Cluster test selection with SSKM is more effective when failed tests make up a medium proportion of the test set. (2) A strict definition of the pairwise constraints can improve the effectiveness of cluster test selection with SSKM.
  • Keywords
    pattern clustering; regression analysis; statistical testing; cannot-link; cluster test selection; clustering algorithm; fault detection; must-link; pair wise constraint; regression test selection technique; regression testing; semisupervised K-means; semisupervised clustering; unsupervised clustering; Clustering methods; Euclidean distance; Flexible printed circuits; Machine learning; Software testing; K-means; Test selection; pairwise constraint; semi-supervised clustering
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Testing, Verification and Validation (ICST), 2011 IEEE Fourth International Conference on
  • Conference_Location
    Berlin
  • Print_ISBN
    978-1-61284-174-8
  • Electronic_ISBN
    978-0-7695-4342-0
  • Type
    conf
  • DOI
    10.1109/ICST.2011.38
  • Filename
    5770589
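
The abstract describes clustering tests under Must-link and Cannot-link constraints derived from previous test results, without saying how the constraints are derived or enforced. The following is a minimal Python sketch of that idea, not the paper's SSKM algorithm. It rests on labeled assumptions: two previously failed tests get a Must-link, a failed and a passed test get a Cannot-link, and constraints are enforced during assignment in the style of COP-KMeans. The function names, the toy coverage vectors, and the naive initialization are all hypothetical.

```python
from itertools import combinations


def derive_constraints(failed):
    """Build pairwise constraints from previous pass/fail outcomes.

    Assumption (not stated in the abstract): two failed tests must link,
    a failed and a passed test cannot link, two passed tests are unconstrained.
    """
    must_link, cannot_link = set(), set()
    for a, b in combinations(range(len(failed)), 2):
        if failed[a] and failed[b]:
            must_link.add((a, b))
        elif failed[a] != failed[b]:
            cannot_link.add((a, b))
    return must_link, cannot_link


def violates(i, cluster, labels, must_link, cannot_link):
    """True if placing test i in `cluster` breaks a constraint with a labeled test."""
    for a, b in must_link:
        if i in (a, b):
            other = b if i == a else a
            if labels[other] is not None and labels[other] != cluster:
                return True
    for a, b in cannot_link:
        if i in (a, b):
            other = b if i == a else a
            if labels[other] == cluster:
                return True
    return False


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def constrained_kmeans(points, k, must_link, cannot_link, iters=10):
    """K-means whose assignment step skips clusters that violate a constraint."""
    centers = [list(p) for p in points[:k]]  # naive init: first k points
    labels = [None] * len(points)
    for _ in range(iters):
        labels = [None] * len(points)
        for i, p in enumerate(points):
            order = sorted(range(k), key=lambda c: sq_dist(p, centers[c]))
            for c in order:
                if not violates(i, c, labels, must_link, cannot_link):
                    labels[i] = c
                    break
            if labels[i] is None:
                labels[i] = order[0]  # assumption: fall back to nearest center
        for c in range(k):
            members = [points[i] for i, lab in enumerate(labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical data: one feature vector per test, plus last run's outcome.
points = [(1.0, 0.0), (0.9, 0.1), (0.8, 0.2), (0.0, 1.0)]
failed = [True, True, False, False]
must_link, cannot_link = derive_constraints(failed)
labels = constrained_kmeans(points, 2, must_link, cannot_link)
print(labels)
```

In this toy input the third test lies close to the two failed tests, so distance alone would tend to group it with them; the Cannot-link constraints keep it in the other cluster. A stricter or looser constraint definition would change only `derive_constraints`.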