• DocumentCode
    2359830
  • Title
    On finding duplication and near-duplication in large software systems
  • Author
    Baker, Brenda S.
  • Author_Institution
    AT&T Bell Labs., Murray Hill, NJ, USA
  • fYear
    1995
  • fDate
    14-16 Jul 1995
  • Firstpage
    86
  • Lastpage
    95
  • Abstract
    This paper describes how a program called dup can be used to locate instances of duplication or near-duplication in a software system. Dup reports both textually identical sections of code and sections that are the same textually except for systematic substitution of one set of variable names and constants for another. Further processing locates longer sections of code that are the same except for other small modifications. Experimental results from running dup on millions of lines from two large software systems show dup to be both effective at locating duplication and fast. Applications could include identifying sections of code that should be replaced by procedures, elimination of duplication during reengineering of the system, redocumentation to include references to copies, and debugging.
  • Keywords
    program debugging; software tools; system documentation; systems analysis; systems re-engineering; constants; debugging; dup; experimental results; large software systems; redocumentation; software duplication; system reengineering; systematic substitution; variable names; Application software; Computer bugs; Programming profession; Reverse engineering; Scattering parameters; Sections; Software systems; Terminology; White spaces
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Reverse Engineering, 1995., Proceedings of 2nd Working Conference on
  • Conference_Location
    Toronto, Ont.
  • Print_ISBN
    0-8186-7111-4
  • Type
    conf
  • DOI
    10.1109/WCRE.1995.514697
  • Filename
    514697
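
The abstract's notion of sections that are "the same textually except for systematic substitution of one set of variable names and constants for another" can be illustrated with a small check. The following is only a naive sketch, not dup's actual algorithm: the tokenizer, the keyword list, the function names, and the requirement that the renaming be one-to-one are all assumptions made for illustration.

```python
import re

# Hypothetical keyword list: these tokens must match exactly and are
# never treated as renameable parameters.
KEYWORDS = frozenset({"if", "else", "for", "while", "return", "int", "char"})

# Identifiers, integer constants, or any single non-space character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize(lines):
    """Split code lines into tokens, discarding white space."""
    return [tok for line in lines for tok in TOKEN.findall(line)]


def is_parameter(tok):
    """Identifiers and numeric constants (but not keywords) may be renamed."""
    first = tok[0]
    return (first.isalpha() or first == "_" or first.isdigit()) and tok not in KEYWORDS


def parameterized_match(a_lines, b_lines):
    """Return True if the two sections are identical up to a consistent,
    one-to-one renaming of identifiers and constants (an assumption of
    this sketch)."""
    a, b = tokenize(a_lines), tokenize(b_lines)
    if len(a) != len(b):
        return False
    forward, backward = {}, {}
    for x, y in zip(a, b):
        if is_parameter(x) != is_parameter(y):
            return False
        if not is_parameter(x):
            if x != y:  # operators, punctuation, keywords must be equal
                return False
            continue
        # Each name in a maps to exactly one name in b, and vice versa.
        if forward.setdefault(x, y) != y or backward.setdefault(y, x) != x:
            return False
    return True
```

For example, `["x = y + 1;", "if (x > y) return x;"]` and `["p = q + 2;", "if (p > q) return p;"]` match under the renaming x→p, y→q, 1→2, whereas swapping `p` and `q` in only the second line breaks the consistency of the substitution. A pairwise check like this is quadratic if applied to every pair of sections, so it would not scale to millions of lines on its own.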