DocumentCode :
3487610
Title :
Achieving Linguistic Provenance via Plagiarism Detection
Author :
Idika, Nwokedi ; Phan, Huy Anh ; Varia, Mayank
Author_Institution :
Lincoln Lab., MIT, Lexington, MA, USA
fYear :
2013
fDate :
25-28 Aug. 2013
Firstpage :
648
Lastpage :
652
Abstract :
To go beyond what current provenance systems can capture for natural language text documents, we propose the Lincoln Laboratory Plagiarism for Provenance System (LLPla) as an approach for capturing linguistic provenance. Linguistic provenance infers the origin of text based on its linguistic structure. We take a plagiarism detection approach to this task as identifying similar sections of text is fundamental to linguistic provenance and central to LLPla's performance. Thus, to determine the most viable plagiarism detection algorithm for use in LLPla, we evaluate three state-of-the-art plagiarism detection algorithms. Moreover, we propose extensions to the best-performing algorithm that improve its precision with negligible effects on recall.
Keywords :
graph theory; linguistics; natural language processing; text analysis; LLPla approach; Lincoln Laboratory Plagiarism for Provenance System; linguistic provenance; linguistic structure; natural language text documents; plagiarism detection approach; recall effect; text origin; Conferences; Detection algorithms; Generators; Laboratories; Plagiarism; Pragmatics; Probabilistic logic; graphs; plagiarism detection; provenance;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
Conference_Location :
Washington, DC
ISSN :
1520-5363
Type :
conf
DOI :
10.1109/ICDAR.2013.133
Filename :
6628698