DocumentCode
617772
Title
Evaluation of the SHAPD2 algorithm efficiency in plagiarism detection tasks
Author
Ceglarek, Darek
Author_Institution
Dept. of Appl. Inf., Poznan Sch. of Banking, Poznan, Poland
fYear
2013
fDate
9-11 May 2013
Firstpage
465
Lastpage
470
Abstract
This work presents results of ongoing research in natural language processing, focusing on plagiarism detection that applies semantic networks and semantic compression. The results demonstrate that semantic compression is a valuable addition to existing plagiarism detection methods. Applying semantic compression boosts the efficiency of the Sentence Hashing Algorithm for Plagiarism Detection 2 (SHAPD2) and of the w-shingling algorithm. Experiments were performed on the Clough & Stephenson corpus as well as on a PAN-PC plagiarism corpus available for evaluating plagiarism detection methods, so the results can be compared with those of other research teams.
Keywords
data compression; natural language processing; semantic networks; text analysis; Clough-&-Stephenson corpus; PAN-PC plagiarism corpus; SHAPD2 algorithm efficiency evaluation; plagiarism detection tasks; semantic compression; sentence hashing algorithm-for-plagiarism detection 2 efficiency evaluation; w-shingling algorithm efficiency evaluation; Abstracts; Benchmark testing; Frequency-domain analysis; Intellectual property; Plagiarism; Semantics; Standards; longest common subsequence; plagiarism detection; sentence hashing; text mining
fLanguage
English
Publisher
IEEE
Conference_Titel
Technological Advances in Electrical, Electronics and Computer Engineering (TAEECE), 2013 International Conference on
Conference_Location
Konya
Print_ISBN
978-1-4673-5612-1
Type
conf
DOI
10.1109/TAEECE.2013.6557319
Filename
6557319