DocumentCode
2507190
Title
Detecting and tracing plagiarized documents by reconstruction plagiarism-evolution tree
Author
Ryu, Chang-Keon ; Kim, Hyong-Jun ; Ji, Seung-Hyun ; Woo, Gyun ; Cho, Hwan-Gue
Author_Institution
Dept. of Comput. Sci. & Eng., Pusan Nat. Univ., Busan
fYear
2008
fDate
8-11 July 2008
Firstpage
119
Lastpage
124
Abstract
Smart word processors and powerful Web search engines have made plagiarism widespread, especially in digital texts. It is therefore crucial to develop efficient and effective anti-plagiarism tools to prevent or identify document plagiarism. To date, a few plagiarism detection systems have been announced. All previous plagiarism detection studies focus on how to measure the similarity of documents. In this paper, we propose a new approach that reconstructs the evolution process of suspected texts in order to detect plagiarized documents. To this end, we propose two major metrics: spatial plagiarism similarity and temporal plagiarism similarity. By combining these two similarity measures, we derive an evolutionary plagiarism probability model based on the Weibull distribution, one of the extreme value distributions used to compute the statistical significance of genomic sequence matching. The main difference between our model and previous studies is that our model can estimate plagiarism and its direction as a temporal event. In an experiment on a group of Internet-posted news articles, the model's results clearly coincided with the actual plagiarism among those articles.
Keywords
Weibull distribution; document handling; search engines; security of data; trees (mathematics); Web-searching engines; evolutionary plagiarism probability model; extreme distribution; genomic sequence matching; group Internet-posted news; plagiarized documents; reconstruction plagiarism-evolution tree; similarity measure; smart word processors; spatial plagiarism similarity; temporal event; temporal plagiarism similarity; Computer science; Distributed computing; Genomics; Phylogeny; Plagiarism; Power engineering and energy; Probability; Search engines; Web search
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer and Information Technology, 2008. CIT 2008. 8th IEEE International Conference on
Conference_Location
Sydney, NSW
Print_ISBN
978-1-4244-2357-6
Electronic_ISBN
978-1-4244-2358-3
Type
conf
DOI
10.1109/CIT.2008.4594660
Filename
4594660