Regional Information Center for Science and Technology - MAS a scalable framework for research effort evaluation by unsupervised machine learning-Hybrid plagiarism model

DocumentCode :

2456399

Title :

MAS a scalable framework for research effort evaluation by unsupervised machine learning-Hybrid plagiarism model

Author :

Shinde, Sachin V. ; Gawali, Sangram Z. ; Thakore, Devendrasingh M.

Author_Institution :

Inf. Technol., B.V.D.U.s Coll. of Eng., Pune, India

fYear :

2015

fDate :

8-10 Jan. 2015

Firstpage :

Lastpage :

Abstract :

In the web era, new information appears every day, and researchers continually add work to their research domains. Detecting the originality of research work is therefore in high demand. In the academic sector, student researchers propose innovative ideas and algorithms, claiming that their work outperforms prior research. They may test a null hypothesis or an alternative hypothesis, and assessing their effort is a challenge. Plagiarism detectors can be used to evaluate or grade such academic efforts, which is the essence of research on plagiarized content detection and grading. One technical issue our research highlights is designing an algorithm that adapts to the changing nature of the dataset, which grows as new research work is added over time. Data extraction from unstructured information is challenging because no standard pattern has yet been defined; such patterns vary from one piece of research to another and are domain specific. A document in question (is it plagiarized or not?) is a join of one or more sentences that either originate from the author's own research or are referenced from previous publications. To make their work appear original, authors use paraphrasing, which may retain semantic similarity; some content also acts as a metaphor for upcoming research work. Pointing out such activity is a complex task. The methodology treats a document in question as a join of sentences and each sentence as a join of terms. We therefore conclude that plagiarism detection can be performed effectively through fork and join operations. The document in question is split to produce a sentence vector. For each sentence in the sentence vector, a term vector is generated by forking the sentence into terms. A mapper is implemented that maps each term to its sentence and each sentence to its source document. To enhance the accuracy of the model, a Multi Agent Based System (MAS) frame is recommended to accommodate varying similarity functions. It achieves parallelism in the system and allows new similarity measures to be adopted and those no longer suitable for the task to be removed.
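
The fork, map, and join steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the naive sentence and term splitting, the Jaccard measure, and the averaging of scores are all assumptions made for illustration. The list of interchangeable similarity functions only stands in for the MAS agents; the sketch runs them sequentially and does not model the parallelism the abstract mentions.

```python
import math
import re
from collections import Counter, defaultdict


def fork_sentences(document):
    """Fork a document into a sentence vector (naive split on . ! ?)."""
    return [s.strip() for s in re.split(r"[.!?]+", document) if s.strip()]


def fork_terms(sentence):
    """Fork a sentence into a term vector (lowercased term counts)."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def build_mapper(sources):
    """Build an inverted index: term -> sentence ids, sentence id -> term vector.

    A sentence id is (source name, position), so it also maps a sentence
    back to its source document.
    """
    term_to_sentences = defaultdict(set)
    sentence_terms = {}
    for name, text in sources.items():
        for position, sentence in enumerate(fork_sentences(text)):
            sentence_id = (name, position)
            terms = fork_terms(sentence)
            sentence_terms[sentence_id] = terms
            for term in terms:
                term_to_sentences[term].add(sentence_id)
    return term_to_sentences, sentence_terms


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Jaccard overlap of the term sets (an assumed second measure)."""
    union = set(a) | set(b)
    return len(set(a) & set(b)) / len(union) if union else 0.0


def score_document(document, sources, similarity_functions):
    """Join per-sentence results: (sentence, best source, averaged score)."""
    term_to_sentences, sentence_terms = build_mapper(sources)
    results = []
    for sentence in fork_sentences(document):
        terms = fork_terms(sentence)
        # Only sentences sharing at least one term are candidates.
        candidates = set()
        for term in terms:
            candidates |= term_to_sentences.get(term, set())
        best_score, best_source = 0.0, None
        for sentence_id in candidates:
            other = sentence_terms[sentence_id]
            # Averaging the measures is an assumption of this sketch.
            score = sum(f(terms, other) for f in similarity_functions) / len(
                similarity_functions
            )
            if score > best_score:
                best_score, best_source = score, sentence_id[0]
        results.append((sentence, best_source, best_score))
    return results


# Hypothetical data, for illustration only.
sources = {
    "paper_a": "Plagiarism detection uses cosine similarity. Agents run in parallel.",
    "paper_b": "The weather is nice today.",
}
document = "Plagiarism detection uses cosine similarity. Completely novel zebra statement."
results = score_document(document, sources, [cosine, jaccard])
```

Adding or retiring a similarity measure amounts to changing the list passed to `score_document`, which loosely mirrors the adaptability the abstract attributes to the MAS frame.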

Keywords :

Internet; information retrieval; multi-agent systems; text analysis; unsupervised learning; MAS; MAS frame; NULL hypothesis; academic sector student researchers; alternative hypothesis; data extraction; innovative ideas; multiagent based system frame; paraphrasing; plagiarism detection; plagiarism detectors; plagiarized content detection; plagiarized content grading; research effort evaluation; scalable framework; semantic similarity; sentence vector; similarity functions; unstructured information; unsupervised machine learning-hybrid plagiarism model; Accuracy; Algorithm design and analysis; Classification algorithms; Data mining; Plagiarism; Semantics; Silicon; Cosine similarity; Document in Question; EMA; Inverted Index; MAS; Mapper; PMA; SMA; Sentence vector; Term Vector; Unsupervised Learning; WEMA; WPMA; WSMA; fork; join;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Pervasive Computing (ICPC), 2015 International Conference on

Conference_Location :

Pune

Type :

conf

DOI :

10.1109/PERVASIVE.2015.7087030

Filename :

7087030

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2456399