Automatic Classification of Error Types in Solutions to Programming Assignments at Online Learning Platform
- URL: http://arxiv.org/abs/2107.06009v1
- Date: Tue, 13 Jul 2021 11:59:57 GMT
- Title: Automatic Classification of Error Types in Solutions to Programming Assignments at Online Learning Platform
- Authors: Artyom Lobanov, Timofey Bryksin, Alexey Shpilman
- Abstract summary: We apply machine learning methods to improve the feedback of automated verification systems for programming assignments.
We detect frequent error types by clustering previously submitted incorrect solutions, label these clusters and use this labeled dataset to identify the type of an error in a new submission.
- Score: 4.028503203417233
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online programming courses are becoming more and more popular, but they still
have significant drawbacks when compared to the traditional education system,
e.g., the lack of feedback. In this study, we apply machine learning methods to
improve the feedback of automated verification systems for programming
assignments. We propose an approach that provides insight into how to fix the
code for a given incorrect submission. To achieve this, we detect frequent
error types by clustering previously submitted incorrect solutions, label these
clusters and use this labeled dataset to identify the type of an error in a new
submission. We examine and compare several approaches to the detection of
frequent error types and to the assignment of clusters to new submissions. The
proposed method is evaluated on a dataset provided by a popular online learning
platform.
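The pipeline described in the abstract (cluster previously submitted incorrect solutions, have an expert label each cluster with an error type, then assign a new incorrect submission to a cluster and report that cluster's label as feedback) can be illustrated with a minimal sketch. The abstract does not specify the code representation, the clustering algorithm, or the assignment strategy, so the token-level TF-IDF features, k-means clustering, and nearest-centroid assignment below are illustrative assumptions, not the authors' exact method.

```python
# Minimal sketch of the feedback pipeline, under assumed choices of
# representation (token-level TF-IDF) and clustering (k-means).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Hypothetical historical data: incorrect solutions to one assignment.
incorrect_submissions = [
    "for i in range(1, n): total += i",    # e.g. off-by-one in loop bounds
    "for i in range(2, n): total += i",
    "print(total / count)",                # e.g. possible division by zero
    "print(total / (count - count))",
]

# 1. Represent each incorrect solution as a feature vector (assumption).
vectorizer = TfidfVectorizer(token_pattern=r"\S+")
X = vectorizer.fit_transform(incorrect_submissions)

# 2. Detect frequent error types by clustering the incorrect solutions.
n_clusters = 2  # per-assignment choice; the paper compares detection approaches
kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X)

# 3. An expert inspects each cluster and labels it with an error type
#    (placeholder labels here).
cluster_labels = {i: f"error type #{i} (expert-assigned label)"
                  for i in range(n_clusters)}

# 4. For a new incorrect submission, assign it to the nearest cluster and
#    return that cluster's label as feedback on how to fix the code.
new_submission = ["print(total / count)"]
cluster_id = int(kmeans.predict(vectorizer.transform(new_submission))[0])
print("Suggested error type:", cluster_labels[cluster_id])
```

The paper compares several approaches both for detecting frequent error types and for assigning clusters to new submissions; the nearest-centroid step above merely stands in for whichever assignment strategy is used.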
Related papers
- Understanding and Mitigating Classification Errors Through Interpretable Token Patterns [58.91023283103762]
Characterizing errors in easily interpretable terms gives insight into whether a classifier is prone to making systematic errors.
We propose to discover those patterns of tokens that distinguish correct and erroneous predictions.
We show that our method, Premise, performs well in practice.
arXiv Detail & Related papers (2023-11-18T00:24:26Z)
- XAL: EXplainable Active Learning Makes Classifiers Better Low-resource Learners [71.8257151788923]
We propose a novel Explainable Active Learning framework (XAL) for low-resource text classification.
XAL encourages classifiers to justify their inferences and delve into unlabeled data for which they cannot provide reasonable explanations.
Experiments on six datasets show that XAL achieves consistent improvement over 9 strong baselines.
arXiv Detail & Related papers (2023-10-09T08:07:04Z)
- Probabilistic Safety Regions Via Finite Families of Scalable Classifiers [2.431537995108158]
Supervised classification recognizes patterns in the data to separate classes of behaviours.
Canonical solutions contain misclassification errors that are intrinsic to the approximate, numerical nature of machine learning.
We introduce the concept of probabilistic safety region to describe a subset of the input space in which the number of misclassified instances is probabilistically controlled.
arXiv Detail & Related papers (2023-09-08T22:40:19Z)
- Misclassification in Automated Content Analysis Causes Bias in Regression. Can We Fix It? Yes We Can! [0.30693357740321775]
We show in a systematic literature review that communication scholars largely ignore misclassification bias.
Existing statistical methods can use "gold standard" validation data, such as that created by human annotators, to correct misclassification bias.
We introduce and test such methods, including a new method we design and implement in the R package misclassificationmodels.
arXiv Detail & Related papers (2023-07-12T23:03:55Z)
- Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future [63.99570204416711]
We reimplement 18 methods for detecting potential annotation errors and evaluate them on 9 English datasets.
We define a uniform evaluation setup including a new formalization of the annotation error detection task.
We release our datasets and implementations in an easy-to-use and open source software package.
arXiv Detail & Related papers (2022-06-05T22:31:45Z)
- Opinion Spam Detection: A New Approach Using Machine Learning and Network-Based Algorithms [2.062593640149623]
Online reviews play a crucial role in helping consumers evaluate and compare products and services.
Fake reviews (opinion spam) are becoming more prevalent and negatively impacting customers and service providers.
We propose a new method for classifying reviewers as spammers or benign, combining machine learning with a message-passing algorithm.
arXiv Detail & Related papers (2022-05-26T15:27:46Z)
- Resolving label uncertainty with implicit posterior models [71.62113762278963]
We propose a method for jointly inferring labels across a collection of data samples.
By implicitly assuming the existence of a generative model for which a differentiable predictor is the posterior, we derive a training objective that allows learning under weak beliefs.
arXiv Detail & Related papers (2022-02-28T18:09:44Z)
- AutoNovel: Automatically Discovering and Learning Novel Visual Categories [138.80332861066287]
We present a new approach called AutoNovel to tackle the problem of discovering novel classes in an image collection given labelled examples of other classes.
We evaluate AutoNovel on standard classification benchmarks and substantially outperform current methods for novel category discovery.
arXiv Detail & Related papers (2021-06-29T11:12:16Z)
- Toward Semi-Automatic Misconception Discovery Using Code Embeddings [4.369757255496184]
We present a novel method for the semi-automated discovery of problem-specific misconceptions from students' program code in computing courses.
We trained the model on a block-based programming dataset and used the learned embedding to cluster incorrect student submissions.
arXiv Detail & Related papers (2021-03-07T20:32:41Z)
- Pattern Learning for Detecting Defect Reports and Improvement Requests in App Reviews [4.460358746823561]
In this study, we follow novel approaches that target the absence of actionable insights in app reviews by classifying them as defect reports and requests for improvement.
We employ a supervised system that is capable of learning lexico-semantic patterns through genetic programming.
We show that the automatically learned patterns outperform the manually created ones.
arXiv Detail & Related papers (2020-04-19T08:13:13Z)
- Automatically Discovering and Learning New Visual Categories with Ranking Statistics [145.89790963544314]
We tackle the problem of discovering novel classes in an image collection given labelled examples of other classes.
We learn a general-purpose clustering model and use the latter to identify the new classes in the unlabelled data.
We evaluate our approach on standard classification benchmarks and outperform current methods for novel category discovery by a significant margin.
arXiv Detail & Related papers (2020-02-13T18:53:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.