• DocumentCode
    144888
  • Title

    An improved method for tree-based clone detection in Web Applications

  • Author

    Chaoqun Li ; Jianhua Sun ; Hao Chen

  • Author_Institution
    Sch. of Inf. Sci. & Eng., Hunan Univ., Changsha, China
  • fYear
    2014
  • fDate
    6-8 May 2014
  • Firstpage
    363
  • Lastpage
    367
  • Abstract
    Clone detection has been an active research area for decades, and many tools have been proposed. Existing research shows that clones account for 13%-20% of the code in traditional software, and the clone rate in Web applications may be higher. In this paper, we propose an improved method for code clone detection. Our approach uses randomized kd-trees with dimensionality reduction to cluster the characteristic vectors. We have implemented our algorithm and tested it on large software projects written in Java and PHP, including the JDK and 7 Web applications. Our experimental results show that our tool is efficient in both speed and accuracy. In addition, we conducted empirical studies on Web applications and found that clone rates range from 5% to 82% in PHP Web applications.
  • Keywords
    Internet; Java; data mining; software management; trees (mathematics); JDK; Java; PHP; Web applications; characteristic vector; code clone detection; dimensionality reduction; randomized kd-trees; software clones; software projects; tree-based clone detection; Accuracy; Cloning; Clustering algorithms; Indexes; Software; Vectors; Vegetation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Digital Information and Communication Technology and it's Applications (DICTAP), 2014 Fourth International Conference on
  • Conference_Location
    Bangkok
  • Print_ISBN
    978-1-4799-3723-3
  • Type

    conf

  • DOI
    10.1109/DICTAP.2014.6821712
  • Filename
    6821712