  • DocumentCode
    1733673
  • Title
    Automatic Grading of Computer Programs: A Machine Learning Approach
  • Author
    Srikant, Shashank ; Aggarwal, Vaneet
  • Volume
    1
  • fYear
    2013
  • Firstpage
    85
  • Lastpage
    92
  • Abstract
    The automatic evaluation of computer programs is a nascent area of research with a potential for large-scale impact. Extant program assessment systems score mostly based on the number of test-cases passed, providing no insight into the competency of the programmer. In this paper, we present a machine learning framework to automatically grade computer programs. We propose a set of highly-informative features, derived from the abstract representations of a given program, that capture the program's functionality. These features are then used to learn a grading model, which is built against evaluations done by experts on the basis of a rubric. We show that regression modeling based on the given features provides much better grading than the ubiquitous test-case-pass based grading and rivals the grading accuracy of other open-response problems such as essay grading. We also show that our novel features add significant value over and above basic keyword/expression count features. In addition, we propose a novel way of posing computer-program grading as a one-class modeling problem. Our preliminary investigations in this direction show promising results and suggest an implicit correlation of our features with the proposed grading-levels (rubric). To the best of the authors' knowledge, this is the first time machine learning has been applied to the problem of grading programs. The work is timely with regard to the recent boom in Massively Online Open Courseware (MOOCs), which promises to produce a significant amount of hand-graded digitized data.
  • Keywords
    courseware; learning (artificial intelligence); program testing; regression analysis; ubiquitous computing; MOOC; abstract representations; automatic computer program evaluation; automatic computer-program grading; expression count features; extant program assessment system score; hand-graded digitized data; highly-informative features; keyword count features; machine learning framework; massively online open courseware; one-class modeling problem; open-response problems; program functionality; regression modeling; ubiquitous test-case-pass based grading; Abstracts; Computers; Context; Feature extraction; Grammar; Measurement; Programming; Abstract Syntax Trees; Assessment of Computer Programs; MOOC; One-Class Modeling; Regression; Rubric;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications (ICMLA), 2013 12th International Conference on
  • Conference_Location
    Miami, FL
  • Type
    conf
  • DOI
    10.1109/ICMLA.2013.22
  • Filename
    6784592