Using Large Language Models for Student-Code Guided Test Case Generation in Computer Science Education
- URL: http://arxiv.org/abs/2402.07081v1
- Date: Sun, 11 Feb 2024 01:37:48 GMT
- Title: Using Large Language Models for Student-Code Guided Test Case Generation in Computer Science Education
- Authors: Nischal Ashok Kumar, Andrew Lan
- Abstract summary: Test cases are an integral part of programming assignments in computer science education.
Test cases can be used as assessment items to test students' programming knowledge and provide personalized feedback on student-written code.
We propose a large language model-based approach to automatically generate test cases.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In computer science education, test cases are an integral part of programming
assignments since they can be used as assessment items to test students'
programming knowledge and provide personalized feedback on student-written
code. The goal of our work is to propose a fully automated approach for test
case generation that can accurately measure student knowledge, which is
important for two reasons. First, manually constructing test cases requires
expert knowledge and is a labor-intensive process. Second, developing test
cases for students, especially novice programmers, differs significantly from
developing test cases oriented toward professional-level software developers.
Therefore, we need an automated process for test case generation to
assess student knowledge and provide feedback. In this work, we propose a large
language model-based approach to automatically generate test cases and show
that they are good measures of student knowledge, using a publicly available
dataset that contains student-written Java code. We also discuss future
research directions centered on using test cases to help students.
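To make the proposed pipeline concrete, the sketch below shows one way student-code guided test case generation could be wired up: the prompt combines the problem statement with the student's submission, an LLM returns candidate test cases, and a small harness runs the student's Java program on the generated inputs. The prompt wording, JSON output format, model name, and helper functions (generate_test_cases, run_student_code, pass_rate) are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of student-code guided test case generation.
# Assumptions: an OpenAI-style chat client, a stdin/stdout-style Java
# assignment, and a model that returns bare JSON.
import json
import subprocess
import tempfile
from pathlib import Path

from openai import OpenAI  # any chat-completion client could be substituted

client = OpenAI()

PROMPT_TEMPLATE = """You are helping assess a student's Java program.
Problem description:
{problem}

Student code:
{student_code}

Generate 5 test cases that would reveal whether this code is correct.
Return a JSON list of objects with "stdin" and "expected_stdout" fields."""


def generate_test_cases(problem: str, student_code: str) -> list[dict]:
    """Ask the LLM for test cases guided by the student's submission."""
    prompt = PROMPT_TEMPLATE.format(problem=problem, student_code=student_code)
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return json.loads(response.choices[0].message.content)


def run_student_code(student_code: str, stdin: str, class_name: str = "Main") -> str:
    """Compile the student's Java program and run it on one test input."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / f"{class_name}.java"
        src.write_text(student_code)
        subprocess.run(["javac", str(src)], check=True)
        result = subprocess.run(
            ["java", "-cp", tmp, class_name],
            input=stdin, capture_output=True, text=True, timeout=5,
        )
        return result.stdout.strip()


def pass_rate(problem: str, student_code: str) -> float:
    """Fraction of generated test cases the student code passes; a rough
    proxy for the knowledge measure discussed in the abstract."""
    cases = generate_test_cases(problem, student_code)
    passed = sum(
        run_student_code(student_code, case["stdin"]) == case["expected_stdout"].strip()
        for case in cases
    )
    return passed / len(cases)
```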
Related papers
- Code Interviews: Design and Evaluation of a More Authentic Assessment for Introductory Programming Assignments [15.295438618760164]
We describe code interviews as a more authentic assessment method for take-home programming assignments.
Code interviews pushed students to discuss their work, motivating more nuanced but sometimes repetitive insights.
We conclude by discussing the different decisions about the design of code interviews with implications for student experience, academic integrity, and teaching workload.
arXiv Detail & Related papers (2024-10-01T19:01:41Z)
- Test Case-Informed Knowledge Tracing for Open-ended Coding Tasks [42.22663501257155]
Open-ended coding tasks are common in computer science education.
Traditional knowledge tracing (KT) models that only analyze response correctness may not fully capture nuances in student knowledge from student code.
We introduce Test case-Informed Knowledge Tracing for Open-ended Coding (TIKTOC), a framework to simultaneously analyze and predict both open-ended student code and whether the code passes each test case.
arXiv Detail & Related papers (2024-09-28T03:13:40Z)
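The TIKTOC entry above hinges on a multi-task idea: one model both models the student's code and predicts per-test-case pass/fail outcomes. The toy sketch below illustrates that kind of joint objective in PyTorch; the architecture, dimensions, loss weighting, and the name MultiTaskKTHead are assumptions for illustration, not the authors' model.

```python
# Toy multi-task head: a shared student-state embedding feeds (a) a code
# token head and (b) a per-test-case pass/fail head. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiTaskKTHead(nn.Module):
    def __init__(self, hidden_dim: int, vocab_size: int, num_test_cases: int):
        super().__init__()
        self.code_head = nn.Linear(hidden_dim, vocab_size)      # token logits
        self.test_head = nn.Linear(hidden_dim, num_test_cases)  # pass/fail logits

    def forward(self, student_state: torch.Tensor):
        return self.code_head(student_state), self.test_head(student_state)


def joint_loss(code_logits, code_targets, test_logits, test_labels, alpha=0.5):
    """Weighted sum of a code-modelling loss and a per-test-case BCE loss."""
    code_loss = F.cross_entropy(
        code_logits.view(-1, code_logits.size(-1)), code_targets.view(-1)
    )
    test_loss = F.binary_cross_entropy_with_logits(test_logits, test_labels)
    return alpha * code_loss + (1 - alpha) * test_loss
```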
- Identifying Student Profiles Within Online Judge Systems Using Explainable Artificial Intelligence [6.638206014723678]
Online Judge (OJ) systems are typically used in programming-related courses because they yield fast and objective assessments of the code developed by the students.
This work further exploits the information gathered by the OJ to automatically infer feedback for both the student and the instructor.
arXiv Detail & Related papers (2024-01-29T12:11:30Z)
- Next-Step Hint Generation for Introductory Programming Using Large Language Models [0.8002196839441036]
Large Language Models possess skills such as answering questions, writing essays or solving programming exercises.
This work explores how LLMs can contribute to programming education by supporting students with automated next-step hints.
arXiv Detail & Related papers (2023-12-03T17:51:07Z)
- Generating and Evaluating Tests for K-12 Students with Language Model Simulations: A Case Study on Sentence Reading Efficiency [45.6224547703717]
This study focuses on tests of silent sentence reading efficiency, used to assess students' reading ability over time.
We propose to fine-tune large language models (LLMs) to simulate how previous students would have responded to unseen items.
We show the generated tests closely correspond to the original test's difficulty and reliability based on crowdworker responses.
arXiv Detail & Related papers (2023-10-10T17:59:51Z)
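The K-12 test generation entry above validates generated items by checking that their difficulty and reliability match the original test. As a generic illustration of how such a check can be run (not the paper's analysis pipeline), the snippet below computes classical proportion-correct item difficulty and Cronbach's alpha from a students-by-items response matrix and compares simulated against observed responses.

```python
# Generic psychometric sanity check; responses are 0/1 matrices of shape
# (num_students, num_items), e.g. simulated LLM responses vs. real ones.
import numpy as np


def item_difficulty(responses: np.ndarray) -> np.ndarray:
    """Proportion of students answering each item correctly."""
    return responses.mean(axis=0)


def cronbach_alpha(responses: np.ndarray) -> float:
    """Cronbach's alpha as a simple test-reliability estimate."""
    n_items = responses.shape[1]
    item_vars = responses.var(axis=0, ddof=1)
    total_var = responses.sum(axis=1).var(ddof=1)
    return n_items / (n_items - 1) * (1 - item_vars.sum() / total_var)


# Example comparison: correlate per-item difficulty estimates.
simulated = np.random.binomial(1, 0.7, size=(200, 20))  # placeholder data
observed = np.random.binomial(1, 0.7, size=(200, 20))   # placeholder data
difficulty_corr = np.corrcoef(item_difficulty(simulated), item_difficulty(observed))[0, 1]
```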
- Towards Informed Design and Validation Assistance in Computer Games Using Imitation Learning [65.12226891589592]
This paper proposes a new approach to automated game validation and testing.
Our method leverages a data-driven imitation learning technique, which requires little effort and time and no knowledge of machine learning or programming.
arXiv Detail & Related papers (2022-08-15T11:08:44Z)
- What should I Ask: A Knowledge-driven Approach for Follow-up Questions Generation in Conversational Surveys [63.51903260461746]
We propose a novel task for knowledge-driven follow-up question generation in conversational surveys.
We constructed a new human-annotated dataset of human-written follow-up questions with dialogue history and labeled knowledge.
We then propose a two-staged knowledge-driven model for the task, which generates informative and coherent follow-up questions.
arXiv Detail & Related papers (2022-05-23T00:57:33Z)
- Data-Driven Approach for Log Instruction Quality Assessment [59.04636530383049]
There are no widely adopted guidelines on how to write log instructions with good quality properties.
We identify two quality properties: 1) correct log level assignment assessing the correctness of the log level, and 2) sufficient linguistic structure assessing the minimal richness of the static text necessary for verbose event description.
Our approach correctly assesses log level assignments with an accuracy of 0.88, and the sufficient linguistic structure with an F1 score of 0.99, outperforming the baselines.
arXiv Detail & Related papers (2022-04-06T07:02:23Z)
- ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback [54.142719510638614]
In this paper, we frame the problem of providing feedback as few-shot classification.
A meta-learner adapts to give feedback to student code on a new programming question from just a few examples by instructors.
Our approach was successfully deployed to deliver feedback on 16,000 student exam solutions in a programming course offered by a tier 1 university.
arXiv Detail & Related papers (2021-07-23T22:41:28Z)
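As a rough illustration of the few-shot framing in the ProtoTransformer entry above, the sketch below uses prototype-style classification: each feedback label's prototype is the mean embedding of a handful of instructor-annotated examples, and a new student submission receives the label of the nearest prototype. The embedding source, distance choice, and function names are assumptions made for illustration, not the paper's model.

```python
# Prototype-based few-shot feedback classification (illustrative sketch).
# Embeddings are assumed to come from some pretrained code encoder.
import numpy as np


def build_prototypes(support: dict[str, np.ndarray]) -> dict[str, np.ndarray]:
    """One prototype per feedback label: the mean embedding of the few
    instructor-labeled examples for that label (shape: n_examples x dim)."""
    return {label: embeddings.mean(axis=0) for label, embeddings in support.items()}


def predict_feedback(code_embedding: np.ndarray, prototypes: dict[str, np.ndarray]) -> str:
    """Return the feedback label whose prototype is closest in embedding space."""
    return min(prototypes, key=lambda label: np.linalg.norm(code_embedding - prototypes[label]))
```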
- How do students test software units? [3.6748639131154315]
We asked students to fill in a short survey, complete four exercises, and then fill in a second survey.
We interviewed eleven students in semi-structured interviews, to obtain more in-depth insight.
One of the misconceptions we found is that most students can only think of test cases based on programming code.
Even if no code was provided (black-box testing), students try to come up with code to base their test cases on.
arXiv Detail & Related papers (2021-02-16T07:02:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.