DocumentCode
476209
Title
Plagiarism detection in Chinese based on chunk and paragraph weight
Author
Wang, Tao ; Fan, Xiao-Zhong ; Liu, Jie
Author_Institution
Dept. of Comput. Sci. & Technol., Beijing Inst. of Technol., Beijing
Volume
5
fYear
2008
fDate
12-15 July 2008
Firstpage
2574
Lastpage
2579
Abstract
To detect plagiarism in Chinese academic papers, this paper proposes a chunk-based plagiarism detection algorithm whose chunk extraction method operates on either characters or words. Because different parts of a document differ in importance, two paragraph weight algorithms are proposed and three paragraph weight functions are defined. The best chunk lengths are determined experimentally. Experiments show that using paragraph weights improves detection performance.
Keywords
natural language processing; text analysis; Chinese language; chunk extraction; paragraph weight; plagiarism detection; Cybernetics; Detection algorithms; Fingers; Information retrieval; Machine learning; Paper technology; Plagiarism; Printing; Probability; Space technology; Paragraph weight; Plagiarism detection; Text chunk;
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics, 2008 International Conference on
Conference_Location
Kunming
Print_ISBN
978-1-4244-2095-7
Electronic_ISBN
978-1-4244-2096-4
Type
conf
DOI
10.1109/ICMLC.2008.4620842
Filename
4620842