Toward Semi-Automatic Misconception Discovery Using Code Embeddings
- URL: http://arxiv.org/abs/2103.04448v1
- Date: Sun, 7 Mar 2021 20:32:41 GMT
- Title: Toward Semi-Automatic Misconception Discovery Using Code Embeddings
- Authors: Yang Shi, Krupal Shah, Wengran Wang, Samiha Marwan, Poorvaja Penmetsa and Thomas W. Price
- Abstract summary: We present a novel method for the semi-automated discovery of problem-specific misconceptions from students' program code in computing courses.
We trained the model on a block-based programming dataset and used the learned embedding to cluster incorrect student submissions.
- Score: 4.369757255496184
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding students' misconceptions is important for effective teaching
and assessment. However, discovering such misconceptions manually can be
time-consuming and laborious. Automated misconception discovery can address
these challenges by highlighting patterns in student data, which domain experts
can then inspect to identify misconceptions. In this work, we present a novel
method for the semi-automated discovery of problem-specific misconceptions from
students' program code in computing courses, using a state-of-the-art code
classification model. We trained the model on a block-based programming dataset
and used the learned embedding to cluster incorrect student submissions. We
found these clusters correspond to specific misconceptions about the problem
and would not have been easily discovered with existing approaches. We also
discuss potential applications of our approach and how these misconceptions
inform domain-specific insights into students' learning processes.
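The clustering step described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the embedding vectors are invented toy values (in the paper they would come from the trained code classification model), the submission IDs are hypothetical, and plain k-means stands in for whatever clustering algorithm the authors actually used, which the abstract does not name.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans(vectors, k, iterations=20):
    """Plain k-means; the first k vectors seed the centroids (deterministic)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = mean_vector(members)
    return labels, centroids


def cluster_submissions(embeddings, k):
    """Group submission IDs by cluster label so an expert can inspect each group."""
    ids = list(embeddings)
    labels, _ = kmeans([embeddings[i] for i in ids], k)
    clusters = {}
    for submission_id, label in zip(ids, labels):
        clusters.setdefault(label, []).append(submission_id)
    return clusters


# Hypothetical embeddings of incorrect submissions (toy values, not real data).
incorrect_embeddings = {
    "s1": [0.90, 0.10, 0.00],
    "s2": [0.10, 0.90, 0.00],
    "s3": [0.80, 0.20, 0.10],
    "s4": [0.00, 0.80, 0.20],
    "s5": [0.85, 0.15, 0.05],
}

clusters = cluster_submissions(incorrect_embeddings, k=2)
for label, members in sorted(clusters.items()):
    print(f"cluster {label}: {members}")
```

The method is semi-automatic because the output is only a grouping: a domain expert still has to read the submissions in each cluster and decide whether they share a misconception.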
Related papers
- Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models [79.28821338925947]
Domain-Class Incremental Learning is a realistic but challenging continual learning scenario.
To handle these diverse tasks, pre-trained Vision-Language Models (VLMs) are introduced for their strong generalizability.
This incurs a new problem: the knowledge encoded in the pre-trained VLMs may be disturbed when adapting to new tasks, compromising their inherent zero-shot ability.
Existing methods tackle it by tuning VLMs with knowledge distillation on extra datasets, which demands heavy overhead.
We propose the Distribution-aware Interference-free Knowledge Integration (DIKI) framework, retaining pre-trained knowledge of
arXiv Detail & Related papers (2024-07-07T12:19:37Z)
- Creating a Trajectory for Code Writing: Algorithmic Reasoning Tasks [0.923607423080658]
This paper describes instruments and the machine learning models used for validating them.
We have used the data collected in an introductory programming course in the penultimate week of the semester.
Preliminary research suggests ART type instruments can be combined with specific machine learning models to act as an effective learning trajectory.
arXiv Detail & Related papers (2024-04-03T05:07:01Z)
- Identifying Student Profiles Within Online Judge Systems Using Explainable Artificial Intelligence [6.638206014723678]
Online Judge (OJ) systems are typically considered within programming-related courses as they yield fast and objective assessments of the code developed by the students.
This work aims to tackle this limitation by considering the further exploitation of the information gathered by the OJ and automatically inferring feedback for both the student and the instructor.
arXiv Detail & Related papers (2024-01-29T12:11:30Z)
- Explainable Data-Driven Optimization: From Context to Decision and Back Again [76.84947521482631]
Data-driven optimization uses contextual information and machine learning algorithms to find solutions to decision problems with uncertain parameters.
We introduce a counterfactual explanation methodology tailored to explain solutions to data-driven problems.
We demonstrate our approach by explaining key problems in operations management such as inventory management and routing.
arXiv Detail & Related papers (2023-01-24T15:25:16Z)
- Design Automation for Fast, Lightweight, and Effective Deep Learning Models: A Survey [53.258091735278875]
This survey covers studies of design automation techniques for deep learning models targeting edge computing.
It offers an overview and comparison of key metrics that are used commonly to quantify the proficiency of models in terms of effectiveness, lightness, and computational costs.
The survey proceeds to cover three categories of the state-of-the-art of deep model design automation techniques.
arXiv Detail & Related papers (2022-08-22T12:12:43Z)
- Automatic Classification of Error Types in Solutions to Programming Assignments at Online Learning Platform [4.028503203417233]
We apply machine learning methods to improve the feedback of automated verification systems for programming assignments.
We detect frequent error types by clustering previously submitted incorrect solutions, label these clusters and use this labeled dataset to identify the type of an error in a new submission.
arXiv Detail & Related papers (2021-07-13T11:59:57Z)
- Low-Regret Active learning [64.36270166907788]
We develop an online learning algorithm for identifying unlabeled data points that are most informative for training.
At the core of our work is an efficient algorithm for sleeping experts that is tailored to achieve low regret on predictable (easy) instances.
arXiv Detail & Related papers (2021-04-06T22:53:45Z)
- Knowledge as Invariance -- History and Perspectives of Knowledge-augmented Machine Learning [69.99522650448213]
Research in machine learning is at a turning point.
Research interests are shifting away from increasing the performance of highly parameterized models on exceedingly specific tasks.
This white paper provides an introduction and discussion of this emerging field in machine learning research.
arXiv Detail & Related papers (2020-12-21T15:07:19Z)
- A Survey of Machine Learning Methods and Challenges for Windows Malware Classification [43.4550536920809]
This survey aims to be useful both to cybersecurity practitioners who wish to learn more about how machine learning can be applied to the malware problem, and to data scientists who need the necessary background on the challenges in this uniquely complicated space.
arXiv Detail & Related papers (2020-06-15T17:46:12Z)
- Pattern Learning for Detecting Defect Reports and Improvement Requests in App Reviews [4.460358746823561]
In this study, we follow novel approaches that target this absence of actionable insights by classifying reviews as defect reports and requests for improvement.
We employ a supervised system that is capable of learning lexico-semantic patterns through genetic programming.
We show that the automatically learned patterns outperform the manually created ones.
arXiv Detail & Related papers (2020-04-19T08:13:13Z)
- Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples [84.8370546614042]
The black-box nature of Deep Learning models has posed unanswered questions about what they learn from data.
Generative Adversarial Network (GAN) and multi-objectives are used to furnish a plausible attack to the audited model.
Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework.
arXiv Detail & Related papers (2020-03-25T11:08:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.