Title :
Evaluation of the SHAPD2 algorithm efficiency in plagiarism detection tasks
Author_Institution :
Dept. of Appl. Inf., Poznan Sch. of Banking, Poznan, Poland
Abstract :
This work presents results of ongoing research in natural language processing, focusing on plagiarism detection using semantic networks and semantic compression. The results demonstrate that semantic compression is a valuable addition to existing plagiarism detection methods. Applying semantic compression boosts the efficiency of the Sentence Hashing Algorithm for Plagiarism Detection 2 (SHAPD2) and of the w-shingling algorithm. Experiments were performed on the Clough & Stephenson corpus as well as on the available PAN-PC plagiarism corpus, which is used to evaluate plagiarism detection methods, so the results can be compared with those of other research teams.
Keywords :
data compression; natural language processing; semantic networks; text analysis; Clough-&-Stephenson corpus; PAN-PC plagiarism corpus; SHAPD2 algorithm efficiency evaluation; plagiarism detection tasks; semantic compression; sentence hashing algorithm-for-plagiarism detection 2 efficiency evaluation; w-shingling algorithm efficiency evaluation; Abstracts; Benchmark testing; Frequency-domain analysis; Intellectual property; Plagiarism; Semantics; Standards; longest common subsequence; plagiarism detection; sentence hashing; text mining
Conference_Title :
Technological Advances in Electrical, Electronics and Computer Engineering (TAEECE), 2013 International Conference on
Conference_Location :
Konya
Print_ISBN :
978-1-4673-5612-1
DOI :
10.1109/TAEECE.2013.6557319