• DocumentCode
    2507190
  • Title
    Detecting and tracing plagiarized documents by reconstruction plagiarism-evolution tree
  • Author
    Ryu, Chang-Keon ; Kim, Hyong-Jun ; Ji, Seung-Hyun ; Woo, Gyun ; Cho, Hwan-Gue
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Pusan Nat. Univ., Busan
  • fYear
    2008
  • fDate
    8-11 July 2008
  • Firstpage
    119
  • Lastpage
    124
  • Abstract
    Smart word processors and powerful Web search engines have made plagiarism widespread, especially in digital texts. It is therefore crucial to develop efficient and effective anti-plagiarism tools to prevent or identify document plagiarism. So far, only a few plagiarism detection systems have been announced, and all previous plagiarism detection studies focus on how to measure the similarity of documents. In this paper, we propose a new approach that reconstructs the evolution process of suspected texts in order to detect plagiarized documents. To this end, we propose two major metrics: spatial plagiarism similarity and temporal plagiarism similarity. By combining these two similarity measures, we derive an evolutionary plagiarism probability model based on the Weibull distribution, one of the extreme value distributions used to compute the statistical significance of genomic sequence matching. The main difference between our model and previous studies is that our model can estimate plagiarism and its direction as a temporal event. In an experiment with a group of Internet-posted news articles, the results clearly coincided with the real plagiarism among those articles.
  • Keywords
    Weibull distribution; document handling; search engines; security of data; trees (mathematics); Web-searching engines; evolutionary plagiarism probability model; extreme distribution; genomic sequence matching; group Internet-posted news; plagiarized documents; reconstruction plagiarism-evolution tree; similarity measure; smart word processors; spatial plagiarism similarity; temporal event; temporal plagiarism similarity; Computer science; Distributed computing; Genomics; Phylogeny; Plagiarism; Power engineering and energy; Probability; Search engines; Web search
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer and Information Technology, 2008. CIT 2008. 8th IEEE International Conference on
  • Conference_Location
    Sydney, NSW
  • Print_ISBN
    978-1-4244-2357-6
  • Electronic_ISBN
    978-1-4244-2358-3
  • Type
    conf
  • DOI
    10.1109/CIT.2008.4594660
  • Filename
    4594660