Title of article :
English-Persian Plagiarism Detection based on a Semantic Approach
Author/Authors :
Safi-Esfahani, F. - Islamic Azad University, Najafabad Branch ; Rakian, Sh. - Islamic Azad University, Najafabad Branch ; Nadimi-Shahraki, M.H. - Islamic Azad University, Najafabad Branch
Abstract :
Plagiarism, defined as “the wrongful appropriation of other writers’ or authors’ works and ideas without citing or informing them”, poses a major challenge to the spread and publication of knowledge. Plagiarism is commonly divided into four categories: direct, paraphrasing (re-writing), translation, and combinatory. This paper addresses translational plagiarism, sometimes referred to as cross-lingual plagiarism, in which writers meld a translation with their own words and ideas. Building on monolingual plagiarism detection methods, the paper aims to detect cross-lingual plagiarism. A framework called multi-lingual plagiarism detection (MLPD) is presented for cross-lingual plagiarism analysis, with the ultimate objective of detecting plagiarism cases. English is the reference language, and Persian materials are back-translated using translation tools. The data used to assess MLPD are obtained from the English-Persian Mizan parallel corpus. Apache Solr is used to crawl and index the documents. The mean accuracy of the proposed method was 98.82% when highly accurate translation tools were employed, which indicates the high accuracy of the method. With the Google translation service, the mean accuracy was 56.9%. These tests demonstrate that improved translation tools enhance the accuracy of the developed method.
Keywords :
Text Retrieval , Cross-lingual , Text Similarity , Translation , Plagiarism , Semantic-based Plagiarism Detection
Journal title :
Journal of Artificial Intelligence & Data Mining