Title :
Detecting and tracing plagiarized documents by reconstruction plagiarism-evolution tree
Author :
Ryu, Chang-Keon ; Kim, Hyong-Jun ; Ji, Seung-Hyun ; Woo, Gyun ; Cho, Hwan-Gue
Author_Institution :
Dept. of Comput. Sci. & Eng., Pusan Nat. Univ., Busan
Abstract :
Smart word processors and powerful Web search engines have made plagiarism widespread, especially in digital texts. It is therefore crucial to develop efficient and effective anti-plagiarism tools to prevent or identify document plagiarism. To date, a few plagiarism detection systems have been announced. All previous plagiarism detection studies focus on how to measure the similarity of documents. In this paper, we propose a new approach that reconstructs the evolution process of suspected texts in order to detect plagiarized documents. For this, we propose two major metrics: spatial plagiarism similarity and temporal plagiarism similarity. By combining these two similarity measures, we derive an evolutionary plagiarism probability model based on the Weibull distribution, one of the extreme value distributions used to compute the statistical significance of genomic sequence matching. The main difference between our model and previous studies is that our model can estimate plagiarism and its direction as a temporal event. In an experiment with a group of Internet-posted news articles, the results clearly coincided with the real plagiarism among those articles.
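The abstract does not give the formula that combines the two similarity measures, so the following is only a minimal sketch of the general idea of using a Weibull distribution to judge how significant a similarity score is. The function names and the shape and scale parameters are hypothetical and are not taken from the paper.

```python
import math


def weibull_cdf(x: float, shape: float, scale: float) -> float:
    """Return P(X <= x) for a Weibull(shape, scale) random variable."""
    if shape <= 0 or scale <= 0:
        raise ValueError("shape and scale must be positive")
    if x <= 0:
        # The Weibull distribution has no mass at or below zero.
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def tail_probability(score: float, shape: float, scale: float) -> float:
    """Return P(X > score): the chance of a score this high under the fitted model.

    Assumption: `shape` and `scale` were fitted beforehand to similarity
    scores of unrelated document pairs (hypothetical setup, not the paper's).
    """
    return 1.0 - weibull_cdf(score, shape, scale)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    p = tail_probability(score=3.0, shape=2.0, scale=1.0)
    print(f"tail probability: {p:.6f}")
```

A small tail probability would suggest that a similarity score is unlikely to arise between unrelated documents. How such a value would be combined with a temporal measure to infer the direction of plagiarism is not specified in the abstract.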
Keywords :
Weibull distribution; document handling; search engines; security of data; trees (mathematics); Web-searching engines; evolutionary plagiarism probability model; extreme distribution; genomic sequence matching; group Internet-posted news; plagiarized documents; reconstruction plagiarism-evolution tree; similarity measure; smart word processors; spatial plagiarism similarity; temporal event; temporal plagiarism similarity; Computer science; Distributed computing; Genomics; Phylogeny; Plagiarism; Power engineering and energy; Probability; Web search
Conference_Title :
Computer and Information Technology, 2008. CIT 2008. 8th IEEE International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-1-4244-2357-6
Electronic_ISBN :
978-1-4244-2358-3
DOI :
10.1109/CIT.2008.4594660