Related papers: Toward Semi-Automatic Misconception Discovery Using Code Embeddings

URL: http://arxiv.org/abs/2103.04448v1
Date: Sun, 7 Mar 2021 20:32:41 GMT
Title: Toward Semi-Automatic Misconception Discovery Using Code Embeddings
Authors: Yang Shi, Krupal Shah, Wengran Wang, Samiha Marwan, Poorvaja Penmetsa and Thomas W. Price
Abstract summary: We present a novel method for the semi-automated discovery of problem-specific misconceptions from students' program code in computing courses. We trained the model on a block-based programming dataset and used the learned embedding to cluster incorrect student submissions.
Score: 4.369757255496184
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Understanding students' misconceptions is important for effective teaching and assessment. However, discovering such misconceptions manually can be time-consuming and laborious. Automated misconception discovery can address these challenges by highlighting patterns in student data, which domain experts can then inspect to identify misconceptions. In this work, we present a novel method for the semi-automated discovery of problem-specific misconceptions from students' program code in computing courses, using a state-of-the-art code classification model. We trained the model on a block-based programming dataset and used the learned embedding to cluster incorrect student submissions. We found these clusters correspond to specific misconceptions about the problem and would not have been easily discovered with existing approaches. We also discuss potential applications of our approach and how these misconceptions inform domain-specific insights into students' learning processes.
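The clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, so plain k-means is swapped in here as an assumption, and the two-dimensional embeddings and submission IDs (`s01` to `s05`) are made-up toy values, not output of the authors' model. In the described workflow, a domain expert would then inspect each cluster for a shared misconception.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def farthest_point_init(points: List[Vector], k: int) -> List[List[float]]:
    """Deterministic seeding: start at the first point, then repeatedly
    pick the point farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(points: List[Vector], k: int, iterations: int = 20) -> List[int]:
    """Plain k-means (an assumed stand-in for the paper's clustering step).
    Returns one cluster label per input point."""
    centroids = farthest_point_init(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = mean(members)
    return labels


# Hypothetical embeddings of incorrect submissions (toy values).
embeddings: Dict[str, List[float]] = {
    "s01": [0.90, 0.10],
    "s02": [0.80, 0.20],
    "s03": [0.10, 0.90],
    "s04": [0.20, 0.80],
    "s05": [0.85, 0.15],
}

ids = list(embeddings)
labels = kmeans([embeddings[i] for i in ids], k=2)

# Group submission IDs by cluster so an expert can review each group.
clusters: Dict[int, List[str]] = {}
for submission_id, label in zip(ids, labels):
    clusters.setdefault(label, []).append(submission_id)

for label, members in sorted(clusters.items()):
    print(f"cluster {label}: {members}")
```

The number of clusters `k` is chosen by hand in this sketch; in practice it would need to be tuned, and the grouping is only a starting point for manual inspection.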

Related papers

Algorithms for Adversarially Robust Deep Learning [58.656107500646364]
We discuss recent progress toward designing algorithms that exhibit desirable robustness properties. We present new algorithms that achieve state-of-the-art generalization in medical imaging, molecular identification, and image classification. We propose new attacks and defenses, which represent the frontier of progress toward designing robust language-based agents.
arXiv Detail & Related papers (2025-09-23T14:48:58Z)
Automated Identification of Logical Errors in Programs: Advancing Scalable Analysis of Student Misconceptions [4.0782995609938]
This paper presents a scalable framework for automatically detecting logical errors in students' programming solutions. Our framework is based on an explainable Abstract Syntax Tree (AST) embedding model, the Subtree-based Attention Neural Network (SANN).
arXiv Detail & Related papers (2025-05-16T06:32:51Z)
Probing the Unknown: Exploring Student Interactions with Probeable Problems at Scale in Introductory Programming [4.1153199495993364]
This study explores the use of "Probeable Problems", automatically gradable tasks that have deliberately vague or incomplete specifications. Such problems require students to submit test inputs, or "probes", to clarify requirements before implementation. Systematic strategies, such as thoroughly exploring expected behavior before coding, resulted in fewer incorrect code submissions and correlated with course success.
arXiv Detail & Related papers (2025-04-16T02:50:00Z)
RESTOR: Knowledge Recovery through Machine Unlearning [71.75834077528305]
Large language models trained on web-scale corpora can memorize undesirable datapoints. Many machine unlearning methods have been proposed that aim to 'erase' these datapoints from trained models. We propose the RESTOR framework for machine unlearning based on the following dimensions.
arXiv Detail & Related papers (2024-10-31T20:54:35Z)
LLM-based Cognitive Models of Students with Misconceptions [55.29525439159345]
This paper investigates whether Large Language Models (LLMs) can be instruction-tuned to meet this dual requirement. We introduce MalAlgoPy, a novel Python library that generates datasets reflecting authentic student solution patterns. Our insights enhance our understanding of AI-based student models and pave the way for effective adaptive learning systems.
arXiv Detail & Related papers (2024-10-16T06:51:09Z)
Automated Knowledge Concept Annotation and Question Representation Learning for Knowledge Tracing [59.480951050911436]
We present KCQRL, a framework for automated knowledge concept annotation and question representation learning. We demonstrate the effectiveness of KCQRL across 15 KT algorithms on two large real-world Math learning datasets.
arXiv Detail & Related papers (2024-10-02T16:37:19Z)
Counterfactual Explanations for Clustering Models [11.40145394568897]
Clustering algorithms rely on complex optimisation processes that may be difficult to comprehend. We propose a new, model-agnostic technique for explaining clustering algorithms with counterfactual statements.
arXiv Detail & Related papers (2024-09-19T10:05:58Z)
An Approach to Detect Abnormal Submissions for CodeWorkout Dataset [8.142354661558752]
This paper presents a preliminary study to analyze log data with anomalies. The goal of our work is to overcome the abnormal instances when modeling personalizable recommendations in programming learning environments.
arXiv Detail & Related papers (2024-06-28T00:26:15Z)
Creating a Trajectory for Code Writing: Algorithmic Reasoning Tasks [0.923607423080658]
This paper describes instruments and the machine learning models used for validating them. We have used the data collected in an introductory programming course in the penultimate week of the semester. Preliminary research suggests ART type instruments can be combined with specific machine learning models to act as an effective learning trajectory.
arXiv Detail & Related papers (2024-04-03T05:07:01Z)
Automatic Classification of Error Types in Solutions to Programming Assignments at Online Learning Platform [4.028503203417233]
We apply machine learning methods to improve the feedback of automated verification systems for programming assignments. We detect frequent error types by clustering previously submitted incorrect solutions, label these clusters and use this labeled dataset to identify the type of an error in a new submission.
arXiv Detail & Related papers (2021-07-13T11:59:57Z)
Low-Regret Active learning [64.36270166907788]
We develop an online learning algorithm for identifying unlabeled data points that are most informative for training. At the core of our work is an efficient algorithm for sleeping experts that is tailored to achieve low regret on predictable (easy) instances.
arXiv Detail & Related papers (2021-04-06T22:53:45Z)
Knowledge as Invariance -- History and Perspectives of Knowledge-augmented Machine Learning [69.99522650448213]
Research in machine learning is at a turning point. Research interests are shifting away from increasing the performance of highly parameterized models to exceedingly specific tasks. This white paper provides an introduction and discussion of this emerging field in machine learning research.
arXiv Detail & Related papers (2020-12-21T15:07:19Z)
A Survey of Machine Learning Methods and Challenges for Windows Malware Classification [43.4550536920809]
This survey aims to be useful both to cybersecurity practitioners who wish to learn more about how machine learning can be applied to the malware problem, and to data scientists who need the necessary background on the challenges in this uniquely complicated space.
arXiv Detail & Related papers (2020-06-15T17:46:12Z)
Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples [84.8370546614042]
The black-box nature of Deep Learning models has posed unanswered questions about what they learn from data. A Generative Adversarial Network (GAN) and multi-objectives are used to furnish a plausible attack to the audited model. Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework.
arXiv Detail & Related papers (2020-03-25T11:08:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.