Title :
Modeling and Quantification of Superficial Features Extracted from Source Codes: In Consideration of Fluctuation of Description among Learning Data
Author :
Ohno, Asako ; Murao, Hajime
Author_Institution :
Dept. of Life Design, Shijonawate Gakuen Junior Coll., Daito, Japan
Abstract :
In this paper, we extend our previous work on a new similarity measure for source code plagiarism detection in academia. The method extracts superficial features from source code and represents them as an author's coding style using HMM-based stochastic models called coding models. The outputs of the coding models provide information for judging whether a source code was produced by the student who submitted it. The paper explains the extended method, which accepts fluctuations of description among source codes as part of an author's coding style, and provides the results of evaluation experiments. Another contribution of the paper is a novel methodology that uses our extended method to check the outputs of an existing method. Experimental results showed that the new methodology was effective in reducing false positives in the outputs of the existing method.
Keywords :
feature extraction; hidden Markov models; source coding; HMM-based stochastic models; author coding style; coding models; hidden Markov model; source code plagiarism detection; superficial features extraction modeling; Algorithm design and analysis; Data mining; Educational institutions; Feature extraction; Fluctuations; Hidden Markov models; Internet; Plagiarism; Probability; Stochastic processes
Conference_Title :
Innovative Computing, Information and Control (ICICIC), 2009 Fourth International Conference on
Conference_Location :
Kaohsiung
Print_ISBN :
978-1-4244-5543-0
DOI :
10.1109/ICICIC.2009.263