Title :
Discovering text reuse in large collections of documents: A study of theses in history sciences
Author :
Anton S. Khritankov;Pavel V. Botov;Nikolay S. Surovenko;Sergey V. Tsarkov;Dmitriy V. Viuchnov;Yuri V. Chekhovich
Author_Institution :
Anti-Plagiat JSC, Moscow, Russia
Abstract :
In this paper, we investigate graphs of text reuse cases in scientific degree theses in history sciences (topic codes 07.xx.xx of the Russian Higher Attestation Committee). Using algorithmic and statistical methods, we discovered groups of highly connected theses with a large amount of text reuse between them. In addition, we located works compiled from several other theses and pointed out the sources of reuse.
Conference_Title :
Artificial Intelligence and Natural Language and Information Extraction, Social Media and Web Search FRUCT Conference (AINL-ISMW FRUCT), 2015
DOI :
10.1109/AINL-ISMW-FRUCT.2015.7382965