ParaStudent: Generating and Evaluating Realistic Student Code by Teaching LLMs to Struggle
- URL: http://arxiv.org/abs/2507.12674v2
- Date: Fri, 18 Jul 2025 01:02:16 GMT
- Title: ParaStudent: Generating and Evaluating Realistic Student Code by Teaching LLMs to Struggle
- Authors: Mihran Miroyan, Rose Niousha, Joseph E. Gonzalez, Gireeja Ranade, Narges Norouzi
- Abstract summary: Large Language Models (LLMs) have shown strong performance on programming tasks, but can they generate code like real students?
We present ParaStudent, a systematic study of LLM-based "student-like" code generation in an introductory programming course setting.
- Score: 24.691302820912888
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have shown strong performance on programming tasks, but can they generate code like real students - imperfect, iterative, and stylistically diverse? We present ParaStudent, a systematic study of LLM-based "student-like" code generation in an introductory programming course setting. Using a dataset of timestamped student submissions across multiple semesters, we design low- and high-resolution experiments to model student progress and evaluate code outputs along semantic, functional, and stylistic dimensions. Our results show that fine-tuning significantly improves alignment with real student trajectories and captures error patterns, incremental improvements, and stylistic variations more faithfully. This study shows that modeling realistic student code requires capturing learning dynamics through context-aware generation, temporal modeling, and multi-dimensional evaluation. Code for experiments and evaluation is available at https://github.com/mmiroyan/ParaStudent.
Related papers
- Embracing Imperfection: Simulating Students with Diverse Cognitive Levels Using LLM-based Agents [36.704574105201864]
Large language models (LLMs) are revolutionizing education, with LLM-based agents playing a key role in simulating student behavior.
A major challenge in student simulation is modeling the diverse learning patterns of students at various cognitive levels.
arXiv Detail & Related papers (2025-05-26T13:48:49Z)
- Learning Code-Edit Embedding to Model Student Debugging Behavior [2.1485350418225244]
We propose an encoder-decoder-based model that learns meaningful code-edit embeddings between consecutive student code submissions.
It enables personalized next-step code suggestions that maintain the student's coding style while improving test case correctness.
arXiv Detail & Related papers (2025-02-26T18:54:39Z)
- Classroom Simulacra: Building Contextual Student Generative Agents in Online Education for Learning Behavioral Simulation [10.209326669619273]
We run a 6-week education workshop with N = 60 students to collect fine-grained data using a custom-built online education system.
We propose a transferable iterative reflection (TIR) module that augments both prompting-based and finetuning-based large language models.
arXiv Detail & Related papers (2025-02-04T23:42:52Z)
- DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback [62.235925602004535]
DataEnvGym is a testbed of teacher environments for data generation agents.
It frames data generation as a sequential decision-making task, involving an agent and a data generation engine.
Students are iteratively trained and evaluated on generated data, and their feedback is reported to the agent after each iteration.
arXiv Detail & Related papers (2024-10-08T17:20:37Z)
- Toward In-Context Teaching: Adapting Examples to Students' Misconceptions [54.82965010592045]
We introduce a suite of models and evaluation methods we call AdapT.
AToM is a new probabilistic model for adaptive teaching that jointly infers students' past beliefs and optimizes for the correctness of future beliefs.
Our results highlight both the difficulty of the adaptive teaching task and the potential of learned adaptive models for solving it.
arXiv Detail & Related papers (2024-05-07T17:05:27Z)
- Code Representation Learning At Scale [75.04686476303436]
We fuel code representation learning with a vast amount of code data via a two-stage pretraining scheme.
We first train the encoders via a mix that leverages both randomness in masked language modeling and the structural aspects of programming languages.
We then enhance the representations via contrastive learning with hard negatives and hard positives constructed in an unsupervised manner.
arXiv Detail & Related papers (2024-02-02T22:19:15Z)
- Large Language Models for In-Context Student Modeling: Synthesizing Student's Behavior in Visual Programming [29.65988680948297]
We explore the application of large language models (LLMs) for in-context student modeling in open-ended learning domains.
We introduce a novel framework, LLM for Student Synthesis (LLM-SS), that leverages LLMs to synthesize a student's behavior.
We instantiate several methods based on LLM-SS framework and evaluate them using an existing benchmark, StudentSyn, for student attempt synthesis in a visual programming domain.
arXiv Detail & Related papers (2023-10-15T12:56:13Z)
- LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning [64.55001982176226]
LIBERO is a novel benchmark of lifelong learning for robot manipulation.
We focus on how to efficiently transfer declarative knowledge, procedural knowledge, or the mixture of both.
We develop an extendible procedural generation pipeline that can in principle generate infinitely many tasks.
arXiv Detail & Related papers (2023-06-05T23:32:26Z)
- Multi-granularity Time-based Transformer for Knowledge Tracing [9.788039182463768]
We leverage students' historical data, including their past test scores, to create a personalized model for each student.
We then use these models to predict their future performance on a given test.
arXiv Detail & Related papers (2023-04-11T14:46:38Z)
- Enhancing Semantic Code Search with Multimodal Contrastive Learning and Soft Data Augmentation [50.14232079160476]
We propose a new approach with multimodal contrastive learning and soft data augmentation for code search.
We conduct extensive experiments to evaluate the effectiveness of our approach on a large-scale dataset with six programming languages.
arXiv Detail & Related papers (2022-04-07T08:49:27Z)
- Learning Multi-Objective Curricula for Deep Reinforcement Learning [55.27879754113767]
Various automatic curriculum learning (ACL) methods have been proposed to improve the sample efficiency and final performance of deep reinforcement learning (DRL).
In this paper, we propose a unified automatic curriculum learning framework to create multi-objective but coherent curricula.
In addition to existing hand-designed curricula paradigms, we further design a flexible memory mechanism to learn an abstract curriculum.
arXiv Detail & Related papers (2021-10-06T19:30:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.