Identifying Student Profiles Within Online Judge Systems Using
Explainable Artificial Intelligence
- URL: http://arxiv.org/abs/2402.03948v1
- Date: Mon, 29 Jan 2024 12:11:30 GMT
- Title: Identifying Student Profiles Within Online Judge Systems Using
Explainable Artificial Intelligence
- Authors: Juan Ramón Rico-Juan, Víctor M. Sánchez-Cartagena, Jose J. Valero-Mas, Antonio Javier Gallego
- Abstract summary: Online Judge (OJ) systems are typically used within programming-related courses, as they yield fast and objective assessments of the code developed by the students. However, such assessments usually amount to a single pass/fail decision, which is limited feedback in an educational setting. This work tackles that limitation by further exploiting the information gathered by the OJ and automatically inferring feedback for both the student and the instructor.
- Score: 6.638206014723678
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online Judge (OJ) systems are typically considered within programming-related
courses as they yield fast and objective assessments of the code developed by
the students. Such an evaluation generally provides a single decision based on
a rubric, most commonly whether the submission successfully accomplished the
assignment. Nevertheless, since in an educational context such information may
be deemed insufficient, it would be beneficial for both the student and the
instructor to receive additional feedback about the overall development of the
task. This work aims to tackle this limitation by considering the further
exploitation of the information gathered by the OJ and automatically inferring
feedback for both the student and the instructor. More precisely, we consider
the use of learning-based schemes -- particularly, multi-instance learning
(MIL) and classical machine learning formulations -- to model student behavior.
In addition, explainable artificial intelligence (XAI) is used to provide
human-understandable feedback. The proposal has been evaluated on a case study
comprising 2,500 submissions from roughly 90 students of a programming-related
course in a computer science degree. The results obtained validate the
proposal: the model predicts the student outcome (passing or failing the
assignment) with statistical significance, based solely on the behavioral
pattern inferred from the submissions made to the OJ.
Moreover, the proposal is able to identify prone-to-fail student groups and
profiles as well as other relevant information, which eventually serves as
feedback to both the student and the instructor.
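The abstract gives no implementation details, but the pipeline it describes (each student as a bag of OJ submissions, a classical classifier over aggregated behavioral features, and an XAI step that turns the model's decision into feedback) can be sketched roughly as follows. This is a minimal illustration under assumed feature names and a mean/max bag aggregation; the random forest and permutation importance below stand in for whatever classifier and XAI method the authors actually used.

```python
# Hypothetical sketch: each student is a "bag" of OJ submissions, and the label
# is whether they passed the assignment. Feature names, the mean/max pooling,
# and the model choice are illustrative assumptions, not the paper's setup.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

FEATURES = ["attempts", "verdict_ok", "time_since_release_h", "lines_changed"]

def bag_features(submissions: np.ndarray) -> np.ndarray:
    """Collapse a (n_submissions, n_features) bag into one vector (mean + max)."""
    return np.concatenate([submissions.mean(axis=0), submissions.max(axis=0)])

# Toy data: 90 students, each with a variable number of submissions.
rng = np.random.default_rng(0)
bags = [rng.random((rng.integers(5, 40), len(FEATURES))) for _ in range(90)]
y = rng.integers(0, 2, size=90)            # 1 = passed the assignment

X = np.vstack([bag_features(b) for b in bags])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("held-out accuracy:", model.score(X_te, y_te))

# XAI step: permutation importance ranks which aggregated behaviors drive the
# pass/fail prediction; such a ranking can be reported back as feedback.
imp = permutation_importance(model, X_te, y_te, n_repeats=20, random_state=0)
names = [f"mean_{f}" for f in FEATURES] + [f"max_{f}" for f in FEATURES]
for name, score in sorted(zip(names, imp.importances_mean), key=lambda t: -t[1]):
    print(f"{name:28s} {score:+.3f}")
```

For a MIL formulation proper, the per-submission instances would be fed to a dedicated multi-instance learner; the mean/max pooling above is only the simplest bag-level baseline, and the ranked importances merely illustrate the kind of human-readable signal that could be returned to students and instructors.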
Related papers
- AERA Chat: An Interactive Platform for Automated Explainable Student Answer Assessment [12.970776782360366]
AERA Chat is an interactive platform to provide visually explained assessment of student answers.
Users can input questions and student answers to obtain automated, explainable assessment results from large language models.
arXiv Detail & Related papers (2024-10-12T11:57:53Z)
- Evaluating and Optimizing Educational Content with Large Language Model Judgments [52.33701672559594]
We use Language Models (LMs) as educational experts to assess the impact of various instructions on learning outcomes.
We introduce an instruction optimization approach in which one LM generates instructional materials using the judgments of another LM as a reward function.
Human teachers' evaluations of these LM-generated worksheets show a significant alignment between the LM judgments and human teacher preferences.
arXiv Detail & Related papers (2024-03-05T09:09:15Z)
- Evaluating the Utility of Model Explanations for Model Development [54.23538543168767]
We evaluate whether explanations can improve human decision-making in practical scenarios of machine learning model development.
To our surprise, we did not find evidence of significant improvement on tasks when users were provided with any of the saliency maps.
These findings suggest caution about the usefulness of saliency-based explanations and their potential to be misunderstood.
arXiv Detail & Related papers (2023-12-10T23:13:23Z)
- Exploring the Potential of Large Language Models to Generate Formative Programming Feedback [0.5371337604556311]
We explore the potential of large language models (LLMs) for computing educators and learners.
To achieve these goals, we used students' programming sequences from a dataset gathered within a CS1 course as input for ChatGPT.
Results show that ChatGPT performs reasonably well for some of the introductory programming tasks and student errors.
However, educators should provide guidance on how to use the provided feedback, as it can contain misleading information for novices.
arXiv Detail & Related papers (2023-08-31T15:22:11Z)
- Giving Feedback on Interactive Student Programs with Meta-Exploration [74.5597783609281]
Developing interactive software, such as websites or games, is a particularly engaging way to learn computer science.
Standard approaches require instructors to manually grade student-implemented interactive programs.
Online platforms that serve millions, like Code.org, are unable to provide any feedback on assignments for implementing interactive programs.
arXiv Detail & Related papers (2022-11-16T10:00:23Z)
- A Multicriteria Evaluation for Data-Driven Programming Feedback Systems: Accuracy, Effectiveness, Fallibility, and Students' Response [7.167352606079407]
Data-driven programming feedback systems can help novices to program in the absence of a human tutor.
Prior evaluations showed that these systems improve learning in terms of test scores or task-completion efficiency.
Less-examined aspects include the inherent fallibility of the current state of the art, students' programming behavior in response to correct/incorrect feedback, and effective/ineffective system components.
arXiv Detail & Related papers (2022-07-27T00:29:32Z)
- ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback [54.142719510638614]
In this paper, we frame the problem of providing feedback as few-shot classification.
A meta-learner adapts to give feedback on student code for a new programming question from just a few instructor-provided examples (a minimal sketch of this few-shot framing is given after this list).
Our approach was successfully deployed to deliver feedback to 16,000 student exam-solutions in a programming course offered by a tier 1 university.
arXiv Detail & Related papers (2021-07-23T22:41:28Z)
- Leveraging Expert Consistency to Improve Algorithmic Decision Support [62.61153549123407]
We explore the use of historical expert decisions as a rich source of information that can be combined with observed outcomes to narrow the construct gap.
We propose an influence function-based methodology to estimate expert consistency indirectly when each case in the data is assessed by a single expert.
Our empirical evaluation, using simulations in a clinical setting and real-world data from the child welfare domain, indicates that the proposed approach successfully narrows the construct gap.
arXiv Detail & Related papers (2021-01-24T05:40:29Z)
- Effects of Human vs. Automatic Feedback on Students' Understanding of AI Concepts and Programming Style [0.0]
The use of automatic grading tools has become nearly ubiquitous in large undergraduate programming courses.
There is a relative lack of data directly comparing student outcomes when receiving computer-generated feedback and human-written feedback.
This paper addresses this gap by splitting one 90-student class into two feedback groups and analyzing differences in the two cohorts' performance.
arXiv Detail & Related papers (2020-11-20T21:40:32Z)
- Toward Machine-Guided, Human-Initiated Explanatory Interactive Learning [9.887110107270196]
Recent work has demonstrated the promise of combining local explanations with active learning for understanding and supervising black-box models.
Here we show that, under specific conditions, these algorithms may misrepresent the quality of the model being learned.
We address this narrative bias by introducing explanatory guided learning.
arXiv Detail & Related papers (2020-07-20T11:51:31Z)
- Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples [84.8370546614042]
The black-box nature of Deep Learning models has posed unanswered questions about what they learn from data.
A Generative Adversarial Network (GAN) combined with a multi-objective formulation is used to furnish a plausible attack on the audited model.
Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework.
arXiv Detail & Related papers (2020-03-25T11:08:56Z)
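As a purely illustrative aside to the ProtoTransformer entry above: its few-shot framing (adapt to a new programming question from a handful of instructor-annotated examples) can be sketched with the simplest prototype-based classifier, where each feedback class is represented by the mean embedding of its support examples and new submissions are assigned to the nearest prototype. The data, embedding dimensionality, and labels below are placeholders; the actual paper meta-learns a transformer encoder over code, which is not reproduced here.

```python
# Minimal prototype-based few-shot classifier (illustrative only; the real
# ProtoTransformer meta-learns a code encoder, which is not reproduced here).
import numpy as np

def prototypes(support_emb: np.ndarray, support_lbl: np.ndarray) -> dict:
    """One prototype per feedback label: the mean embedding of its support examples."""
    return {c: support_emb[support_lbl == c].mean(axis=0) for c in np.unique(support_lbl)}

def predict(query_emb: np.ndarray, protos: dict) -> list:
    """Assign each query embedding to the feedback label of the nearest prototype."""
    labels, mat = zip(*protos.items())
    mat = np.stack(mat)                                        # (n_classes, dim)
    dists = np.linalg.norm(query_emb[:, None, :] - mat[None, :, :], axis=-1)
    return [labels[i] for i in dists.argmin(axis=1)]

# Toy usage: 6 instructor-annotated solutions (the "few examples"), 3 feedback classes.
rng = np.random.default_rng(1)
support = rng.random((6, 16))
support_y = np.array([0, 0, 1, 1, 2, 2])
queries = rng.random((4, 16))                                  # new student submissions
print(predict(queries, prototypes(support, support_y)))
```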
This list is automatically generated from the titles and abstracts of the papers in this site.