PS$^2$: Parameterized Control for Fine-Grained Student Proficiency Simulation
- URL: http://arxiv.org/abs/2602.00850v1
- Date: Sat, 31 Jan 2026 18:27:56 GMT
- Title: PS$^2$: Parameterized Control for Fine-Grained Student Proficiency Simulation
- Authors: Ruochen Liu, Zhiyuan Wen, Hao Yan, Jun Yin, Senzhang Wang, Jiannong Cao,
- Abstract summary: Parameterized Student Proficiency Simulation (PS$^2$) is an unsupervised and parameterized model-level framework that simulates students with different proficiencies.
PS$^2$ achieves finer-grained and consistent proficiency simulation compared to existing baselines.
- Score: 37.112666030892115
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding how students with different proficiency levels respond to educational materials is a critical issue within the field of AI for Education. However, acquiring sufficient real student response data for a robust evaluation is often hindered by cost, ethics, and security constraints. Consequently, LLM-based student proficiency simulation, especially via prompt-based methods, has emerged as a practical alternative under data-scarce conditions. Despite their promise, current methods still exhibit limited controllability due to coarse-grained proficiency representations, high sensitivity to prompt design, and a lack of calibration with academic performance. Therefore, we propose Parameterized Student Proficiency Simulation (PS$^2$), an unsupervised and parameterized model-level framework that simulates students with different proficiencies by interpolating between a strong upper-bound LLM and a weaker, cognitive error-informed lower-bound student LLM via a hybrid ratio. Specifically, the lower-bound model is constructed by fine-tuning the weaker LLM to exhibit cognitive errors when responding to educational materials. To ensure alignment with target proficiency levels, PS$^2$ further calibrates the interpolation ratio against academic performance. Experiments on two public datasets demonstrate that PS$^2$ achieves finer-grained and more consistent proficiency simulation than existing baselines, leading to superior performance in student behavior similarity and item difficulty prediction.
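As a rough illustration of the framework (not the authors' released code), the sketch below assumes the hybrid ratio is a per-parameter linear interpolation between the two models' weights, and that calibration can be done by bisection against a held-out performance probe; `blend_state_dicts`, `calibrate_alpha`, and `probe_fn` are hypothetical names.

```python
# Minimal sketch of model-level proficiency interpolation. Assumes the
# upper- and lower-bound models share one architecture, so their state
# dicts have identical keys; the paper's exact scheme may differ.
def blend_state_dicts(upper_sd, lower_sd, alpha):
    """alpha=1 -> strong upper-bound model; alpha=0 -> cognitive
    error-informed lower-bound model."""
    return {name: alpha * upper_sd[name] + (1.0 - alpha) * lower_sd[name]
            for name in upper_sd}

def calibrate_alpha(target_score, probe_fn, lo=0.0, hi=1.0, iters=20):
    """Bisect for the hybrid ratio whose simulated academic performance
    (probe_fn, e.g. accuracy on a held-out quiz) matches target_score.
    Assumes probe_fn increases monotonically with alpha."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if probe_fn(mid) < target_score:
            lo = mid  # simulated student too weak: move toward upper bound
        else:
            hi = mid  # too strong: move toward the lower-bound model
    return (lo + hi) / 2.0
```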
Related papers
- Take Out Your Calculators: Estimating the Real Difficulty of Question Items with LLM Student Simulations [36.23612429926861]
We investigate the predictive value of open-source large language models (LLMs) for evaluating the difficulty of math questions for real-world students.
We simulate a "classroom" of 4th, 8th, or 12th grade students by prompting the LLM to role-play students of varying proficiency levels.
We observe correlations as high as 0.75, 0.76, and 0.82 for grades 4, 8, and 12, respectively.
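A hedged sketch of that setup, one query per simulated student; `ask_llm` and `grade_fn` are hypothetical stand-ins for a chat-completion client and an answer checker.

```python
# Hypothetical sketch of the LLM "classroom"; ask_llm(prompt) -> str stands
# in for any chat-completion client, grade_fn checks answer correctness.
def simulated_difficulty(question, grade, n_students, ask_llm, grade_fn):
    """Fraction of simulated students answering wrong ~ item difficulty."""
    wrong = 0
    for i in range(n_students):
        prompt = (f"Role-play a grade-{grade} student with proficiency "
                  f"level {i % 5 + 1} of 5. Answer:\n{question}")
        if not grade_fn(ask_llm(prompt)):
            wrong += 1
    return wrong / n_students

# The reported correlations compare such estimates with real difficulties,
# e.g. statistics.correlation(real_difficulties, simulated_estimates).
```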
arXiv Detail & Related papers (2026-01-15T00:25:01Z)
- Efficient Uncertainty in LLMs through Evidential Knowledge Distillation [3.864321514889099]
We introduce a novel approach enabling efficient and effective uncertainty estimation in LLMs without sacrificing performance.
We distill uncertainty-aware teacher models into compact student models sharing the same architecture but fine-tuned using Low-Rank Adaptation (LoRA).
Empirical evaluations on classification datasets demonstrate that such students can achieve comparable or superior predictive and uncertainty quantification performance.
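The summary does not spell out the evidential objective, so the sketch below shows only a generic distillation loss of the kind such a setup could build on; in the LoRA variant, the student's base weights stay frozen while adapters train (e.g. via `peft`).

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=2.0, lam=0.5):
    """Generic knowledge-distillation loss (an assumption, not the paper's
    evidential formulation): KL to the temperature-softened teacher,
    blended with hard-label cross-entropy."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return lam * soft + (1.0 - lam) * hard
```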
arXiv Detail & Related papers (2025-07-24T12:46:40Z)
- SMART: Simulated Students Aligned with Item Response Theory for Question Difficulty Prediction [38.7828715471869]
We present SMART (Simulated Students Aligned with IRT), a novel method for aligning simulated students with instructed ability.
We show that SMART outperforms other item difficulty prediction methods by leveraging its improved ability alignment.
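Aligning a simulated student with an instructed ability presupposes an IRT link between ability and difficulty; below is a minimal sketch using the Rasch (1PL) model, one plausible choice (the summary does not say which IRT variant SMART adopts).

```python
import math

def p_correct(ability, difficulty):
    """Rasch (1PL) model: probability of a correct response when ability
    and difficulty sit on a shared logit scale."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

print(p_correct(1.0, -1.0))  # easy item: ~0.881
print(p_correct(1.0, 2.0))   # hard item: ~0.269
```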
arXiv Detail & Related papers (2025-07-07T15:41:38Z)
- AdvKT: An Adversarial Multi-Step Training Framework for Knowledge Tracing [64.79967583649407]
Knowledge Tracing (KT) monitors students' knowledge states and simulates their responses to question sequences.
Existing KT models typically follow a single-step training paradigm, which leads to significant error accumulation.
We propose a novel Adversarial Multi-Step Training Framework for Knowledge Tracing (AdvKT), which focuses on the multi-step KT task.
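The single-step vs. multi-step distinction is essentially teacher forcing vs. rolling the model forward on its own outputs; a sketch of such a rollout, with `kt_model` a hypothetical tracer returning the probability of a correct next response.

```python
def multi_step_rollout(kt_model, questions, horizon):
    """Feed the model its OWN predicted responses instead of ground truth:
    the regime where single-step-trained KT models accumulate error."""
    history, preds = [], []
    for q in questions[:horizon]:
        p = kt_model(history, q)               # P(correct | simulated history)
        history.append((q, 1 if p >= 0.5 else 0))
        preds.append(p)
    return preds
```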
arXiv Detail & Related papers (2025-04-07T03:31:57Z)
- Learning LLM Preference over Intra-Dialogue Pairs: A Framework for Utterance-level Understandings [9.763273544617176]
Large language models (LLMs) have demonstrated remarkable capabilities in handling complex dialogue tasks without requiring use case-specific fine-tuning.
In this paper, we introduce a simple yet effective framework to address this challenge.
Our approach is specifically designed for per-utterance classification problems, which encompass tasks such as intent detection, dialogue state tracking, and more.
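A common way to learn from such intra-dialogue preference pairs is a Bradley-Terry style objective; a generic sketch follows (the paper's exact loss is not given in this summary).

```python
import torch.nn.functional as F

def pairwise_preference_loss(score_preferred, score_rejected):
    """Push the score of the preferred utterance-level prediction above
    the rejected one drawn from the same dialogue."""
    return -F.logsigmoid(score_preferred - score_rejected).mean()
```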
arXiv Detail & Related papers (2025-03-07T17:46:13Z)
- Exploring LLM-based Student Simulation for Metacognitive Cultivation [33.346260553878984]
We propose a pipeline for automatically generating and filtering high-quality simulated student agents.
Our work paves the way for broader applications in personalized learning and educational assessment.
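In outline the pipeline is generate-then-filter; a hedged sketch with `generate_profile` and `passes_quality` as hypothetical stand-ins for the paper's generation and filtering stages.

```python
def build_student_agents(generate_profile, passes_quality, n_candidates, n_keep):
    """Over-generate candidate simulated-student profiles, then keep only
    those clearing the quality filter."""
    candidates = (generate_profile() for _ in range(n_candidates))
    kept = [c for c in candidates if passes_quality(c)]
    return kept[:n_keep]
```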
arXiv Detail & Related papers (2025-02-17T11:12:47Z)
- A Systematic Examination of Preference Learning through the Lens of Instruction-Following [83.71180850955679]
We use a novel synthetic data generation pipeline to generate 48,000 unique instruction-following prompts.
With our synthetic prompts, we use two preference dataset curation methods: rejection sampling (RS) and Monte Carlo Tree Search (MCTS).
Experiments reveal that shared prefixes in preference pairs, as generated by MCTS, provide marginal but consistent improvements.
High-contrast preference pairs generally outperform low-contrast pairs; however, combining both often yields the best performance.
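For concreteness, rejection sampling over n completions naturally yields the high-contrast pairs mentioned above; `sample` and `reward` are hypothetical stand-ins for a generator and a scoring model.

```python
def rs_preference_pair(prompt, sample, reward, n=8):
    """Rejection-sampling curation sketch: pair the best-scoring completion
    against the worst, producing a high-contrast preference pair."""
    completions = sorted((sample(prompt) for _ in range(n)), key=reward)
    return {"prompt": prompt, "chosen": completions[-1], "rejected": completions[0]}
```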
arXiv Detail & Related papers (2024-12-18T15:38:39Z)
- LLM-based Cognitive Models of Students with Misconceptions [55.29525439159345]
This paper investigates whether Large Language Models (LLMs) can be instruction-tuned to meet this dual requirement.
We introduce MalAlgoPy, a novel Python library that generates datasets reflecting authentic student solution patterns.
Our insights enhance our understanding of AI-based student models and pave the way for effective adaptive learning systems.
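As an illustration of the kind of misconception rule such datasets encode (the actual MalAlgoPy rules are not reproduced here), consider a student who adds fractions component-wise.

```python
from fractions import Fraction

def buggy_fraction_add(a: Fraction, b: Fraction) -> Fraction:
    """Misconception sketch: add numerators and denominators separately."""
    return Fraction(a.numerator + b.numerator, a.denominator + b.denominator)

print(buggy_fraction_add(Fraction(1, 2), Fraction(1, 3)))  # 2/5, not 5/6
```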
arXiv Detail & Related papers (2024-10-16T06:51:09Z)
- Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve model alignment across different task scenarios.
We implement UAL in a simple fashion -- adaptively setting the label smoothing value of training according to the uncertainty of individual samples.
Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
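A sketch of the adaptive-smoothing idea; mapping each sample's uncertainty (assumed already scaled to [0, 1]) directly to PyTorch's `label_smoothing` argument is an assumption, not necessarily the paper's exact rule.

```python
import torch
import torch.nn.functional as F

def ual_style_loss(logits, labels, uncertainty):
    """Per-sample label smoothing: the more uncertain the sample, the
    softer its training target."""
    losses = [F.cross_entropy(lg.unsqueeze(0), lb.unsqueeze(0),
                              label_smoothing=float(u))
              for lg, lb, u in zip(logits, labels, uncertainty)]
    return torch.stack(losses).mean()
```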
arXiv Detail & Related papers (2024-06-07T11:37:45Z)
- Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity [51.476337785345436]
We study a pessimistic variant of Q-learning in the context of finite-horizon Markov decision processes.
A variance-reduced pessimistic Q-learning algorithm is proposed to achieve near-optimal sample complexity.
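At its core, pessimism subtracts a count-based uncertainty penalty from the bootstrapped target; below is a tabular sketch (the paper's variance-reduction and finite-horizon details are omitted).

```python
def pessimistic_q_update(Q, counts, s, a, r, s_next, actions, gamma=0.99, c=1.0):
    """One pessimistic Q-learning step on dictionaries Q and counts:
    rarely visited (s, a) pairs receive a larger penalty, hence lower values."""
    counts[(s, a)] = counts.get((s, a), 0) + 1
    lr = 1.0 / counts[(s, a)]
    penalty = c / counts[(s, a)] ** 0.5
    target = r + gamma * max(Q.get((s_next, b), 0.0) for b in actions) - penalty
    Q[(s, a)] = (1.0 - lr) * Q.get((s, a), 0.0) + lr * max(target, 0.0)
```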
arXiv Detail & Related papers (2022-02-28T15:39:36Z)