• DocumentCode
    2867281
  • Title
    Version history based source code plagiarism detection in proprietary systems
  • Author
    Maskeri, G. ; Karnam, D. ; Viswanathan, S.A. ; Padmanabhuni, Srinivas
  • Author_Institution
    Infosys Labs., Infosys Ltd., Bangalore, India
  • fYear
    2012
  • fDate
    23-28 Sept. 2012
  • Firstpage
    609
  • Lastpage
    612
  • Abstract
    While the advent of open source code search tools has made the source code of thousands of open source software (OSS) projects readily accessible, thereby increasing legitimate reuse, it has also opened up the possibility of unconscientious employees plagiarizing code from OSS repositories. Plagiarism in proprietary software would not only lead to costly lawsuits but also undermine the credibility of the organization. Hence, detecting plagiarism in proprietary software is an urgent need. Though a number of techniques exist for detecting plagiarism in student project assignments, they do not scale well to large proprietary software, especially when code snippets are plagiarized from the large number of available open source software projects. In this paper, we propose a novel approach that applies Mining Software Repositories (MSR) based techniques to the problem of plagiarism detection. We create a programming style profile for each maintenance engineer by mining the version history and use it to detect source code commits that are likely to be plagiarized. Such suspected code fragments can then be analyzed using any of the existing plagiarism detection techniques to confirm the plagiarism and ascertain the original code.
  • Keywords
    data mining; organisational aspects; program diagnostics; public domain software; software maintenance; software reusability; source coding; MSR based techniques; OSS repositories; code snippets; legitimate reuse; mining software repositories based techniques; open source code search tools; open source software; organization credibility; programming style profile; proprietary systems; student project assignments; version history based source code plagiarism detection; Cloning; Conferences; Educational institutions; History; Plagiarism; Programming; Software; Author Information; Plagiarism; Version History;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Maintenance (ICSM), 2012 28th IEEE International Conference on
  • Conference_Location
    Trento
  • ISSN
    1063-6773
  • Print_ISBN
    978-1-4673-2313-0
  • Type
    conf
  • DOI
    10.1109/ICSM.2012.6405334
  • Filename
    6405334
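
The approach summarized in the abstract (build a programming style profile per engineer from the version history, then flag commits that deviate from it) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which style features or which detection rule the paper uses, so the four features, the `floor` value, the `threshold` value, and all function names below are assumptions chosen only for illustration.

```python
import re
import statistics
from collections import defaultdict

IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
CAMEL = re.compile(r"[a-z]+[A-Z]")

def style_features(code):
    """Hypothetical style vector: four shares, each in the range [0, 1]."""
    lines = [ln for ln in code.splitlines() if ln.strip()]
    idents = IDENT.findall(code)
    n_lines = max(len(lines), 1)
    n_idents = max(len(idents), 1)
    return [
        sum(ln.startswith("\t") for ln in lines) / n_lines,        # tab-indented lines
        sum("_" in name for name in idents) / n_idents,            # snake_case names
        sum(bool(CAMEL.match(name)) for name in idents) / n_idents,  # camelCase names
        sum(ln.lstrip().startswith(("#", "//")) for ln in lines) / n_lines,  # comment lines
    ]

def build_profiles(history):
    """history: iterable of (author, code) pairs taken from past commits.
    Returns a dict mapping author -> (feature means, feature std deviations)."""
    by_author = defaultdict(list)
    for author, code in history:
        by_author[author].append(style_features(code))
    profiles = {}
    for author, vectors in by_author.items():
        columns = list(zip(*vectors))
        means = [statistics.fmean(col) for col in columns]
        stdevs = [statistics.pstdev(col) for col in columns]
        profiles[author] = (means, stdevs)
    return profiles

def deviation(profile, code, floor=0.05):
    """Mean absolute z-score of a new fragment against an author's profile.
    `floor` (an assumed value) avoids division by a zero std deviation."""
    means, stdevs = profile
    feats = style_features(code)
    scores = [abs(f - m) / max(s, floor) for f, m, s in zip(feats, means, stdevs)]
    return sum(scores) / len(scores)

def is_suspicious(profiles, author, code, threshold=3.0):
    """Flag a commit whose style deviates strongly from its author's profile.
    `threshold` is an assumed cut-off, not a value from the paper."""
    return deviation(profiles[author], code) > threshold
```

A commit flagged by `is_suspicious` would only be a candidate: as the abstract notes, suspected fragments still need to be passed to an existing plagiarism detection technique to confirm the plagiarism and locate the original code.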