Title :
Chinese semantic knowledge representation and overlap measure for Chinese documents
Author :
Li, Xu ; Yu, Xiaoqiang ; Yao, Chunlong ; Zhao, Xiuyan
Author_Institution :
Inf. Sci. & Eng. Coll., Dalian Polytechnic Univ., Dalian, China
Abstract :
Document copy detection judges whether a given query document plagiarizes content from other documents in a database. Plagiarism can take several forms, such as duplicating part or all of a document, or using different words or sentences to express the same meaning as the text of an earlier document. Matching hashed chunks is relatively simple and suffices for reliably detecting exact overlaps. Detecting paraphrased overlap, however, is more subtle. To address this problem, a frame-based Chinese semantic knowledge representation and an overlap measure for Chinese documents are proposed. The experimental results show that the method can identify complicated plagiarism patterns, such as single-word synonym substitution, voice changes, part-of-speech changes, and the breaking of long sentences.
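The hashed-chunk baseline mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the function names `chunk_hashes` and `exact_overlap`, the chunk length `k`, the use of character-level chunks (chosen here because Chinese text is not whitespace-delimited), and the SHA-256 hash are all assumptions made for illustration.

```python
import hashlib


def chunk_hashes(text, k=5):
    """Return the set of hashes of all k-character chunks in text.

    Hypothetical helper: character-level chunks are an assumption,
    not something the abstract specifies.
    """
    # Drop whitespace so layout changes do not alter the chunks.
    chars = "".join(text.split())
    if not chars:
        return set()
    if len(chars) < k:
        # Text shorter than one chunk: hash it as a single chunk.
        return {hashlib.sha256(chars.encode("utf-8")).hexdigest()}
    return {
        hashlib.sha256(chars[i:i + k].encode("utf-8")).hexdigest()
        for i in range(len(chars) - k + 1)
    }


def exact_overlap(query, source, k=5):
    """Fraction of the query's chunk hashes that also occur in the source."""
    q = chunk_hashes(query, k)
    if not q:
        return 0.0
    return len(q & chunk_hashes(source, k)) / len(q)
```

Under these assumptions, an identical copy scores 1.0, while a voice change such as rewriting "他打开了门" as "门被他打开了" shares no 5-character chunk with the original and scores 0.0. This is the kind of paraphrase overlap that exact chunk matching misses.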
Keywords :
knowledge representation; natural language processing; query processing; text analysis; Chinese documents; breaking long sentence; complicated plagiarism pattern identification; document copy detection; frame-based Chinese semantic knowledge representation; hashed chunk matching; overlap measure method; paraphrase overlap detection; part of speech changes; query document plagiarize content; single-word synonym; voice changes; Compounds; Databases; Knowledge representation; Plagiarism; Semantics; Speech; Syntactics;
Conference_Title :
Intelligent Control and Information Processing (ICICIP), 2012 Third International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-1-4577-2144-1
DOI :
10.1109/ICICIP.2012.6391442