• DocumentCode
    617772
  • Title
    Evaluation of the SHAPD2 algorithm efficiency in plagiarism detection tasks
  • Author
    Ceglarek, Darek
  • Author_Institution
    Dept. of Appl. Inf., Poznan Sch. of Banking, Poznan, Poland
  • fYear
    2013
  • fDate
    9-11 May 2013
  • Firstpage
    465
  • Lastpage
    470
  • Abstract
    This work presents results of ongoing research in natural language processing focused on plagiarism detection using semantic networks and semantic compression. The results demonstrate that semantic compression is a valuable addition to existing plagiarism detection methods. Applying semantic compression boosts the efficiency of the Sentence Hashing Algorithm for Plagiarism Detection 2 (SHAPD2) and of the w-shingling algorithm. Experiments were performed on the Clough & Stephenson corpus as well as on an available PAN-PC plagiarism corpus used to evaluate plagiarism detection methods, so the results can be compared with those of other research teams.
  • Keywords
    data compression; natural language processing; semantic networks; text analysis; Clough-&-Stephenson corpus; PAN-PC plagiarism corpus; SHAPD2 algorithm efficiency evaluation; plagiarism detection tasks; semantic compression; sentence hashing algorithm-for-plagiarism detection 2 efficiency evaluation; w-shingling algorithm efficiency evaluation; Abstracts; Benchmark testing; Frequency-domain analysis; Intellectual property; Plagiarism; Semantics; Standards; longest common subsequence; plagiarism detection; sentence hashing; text mining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Technological Advances in Electrical, Electronics and Computer Engineering (TAEECE), 2013 International Conference on
  • Conference_Location
    Konya
  • Print_ISBN
    978-1-4673-5612-1
  • Type
    conf
  • DOI
    10.1109/TAEECE.2013.6557319
  • Filename
    6557319