DocumentCode :
2676153
Title :
Chinese semantic knowledge representation and overlap measure for Chinese documents
Author :
Li, Xu ; Yu, Xiaoqiang ; Yao, Chunlong ; Zhao, Xiuyan
Author_Institution :
Inf. Sci. & Eng. Coll., Dalian Polytechnic Univ., Dalian, China
fYear :
2012
fDate :
15-17 July 2012
Firstpage :
545
Lastpage :
550
Abstract :
Document copy detection judges whether a given query document plagiarizes the content of other documents in a database. Plagiarism takes several forms, such as duplicating part or all of a document, or using different words or sentences to express the same meaning as the text of earlier documents. Matching hashed chunks is relatively simple and suffices for reliably detecting exact overlaps. However, detecting paraphrase overlap is subtle. To address this problem, a frame-based Chinese semantic knowledge representation and an overlap measure for Chinese documents are proposed. Experimental results show that the method can identify complicated plagiarism patterns, such as single-word synonym substitution, voice changes, part-of-speech changes, and the breaking of long sentences.
Keywords :
knowledge representation; natural language processing; query processing; text analysis; Chinese documents; breaking long sentence; complicated plagiarism pattern identification; document copy detection; frame-based Chinese semantic knowledge representation; hashed chunk matching; overlap measure method; paraphrase overlap detection; part of speech changes; query document plagiarize content; single-word synonym; voice changes; Compounds; Databases; Knowledge representation; Plagiarism; Semantics; Speech; Syntactics;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Control and Information Processing (ICICIP), 2012 Third International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-1-4577-2144-1
Type :
conf
DOI :
10.1109/ICICIP.2012.6391442
Filename :
6391442