• DocumentCode
    3653296
  • Title

    Detecting and modeling local text reuse

  • Author

    David A. Smith;Ryan Cordell;Elizabeth Maddock Dillon;Nick Stramp;John Wilkerson

  • Author_Institution
    College of Computer and Information Science, Northeastern University, Boston, MA, U.S.A.
  • fYear
    2014
  • Firstpage
    183
  • Lastpage
    192
  • Abstract
    Texts propagate through many social networks and provide evidence for their structure. We describe and evaluate efficient algorithms for detecting clusters of reused passages embedded within longer documents in large collections. We apply these techniques to two case studies: analyzing the culture of free reprinting in the nineteenth-century United States and the development of bills into legislation in the U.S. Congress. Using these divergent case studies, we evaluate both the efficiency of the approximate local text reuse detection methods and the accuracy of the results. These techniques allow us to explore how ideas spread, which ideas spread, and which subgroups shared ideas.
  • Keywords
    "Abstracts","Logic gates","Irrigation"
  • Publisher
    IEEE
  • Conference_Titel
    Digital Libraries (JCDL), 2014 IEEE/ACM Joint Conference on
  • Type

    conf

  • DOI
    10.1109/JCDL.2014.6970166
  • Filename
    6970166