Socratic Students: Teaching Language Models to Learn by Asking Questions
- URL: http://arxiv.org/abs/2512.13102v2
- Date: Sat, 20 Dec 2025 05:51:14 GMT
- Title: Socratic Students: Teaching Language Models to Learn by Asking Questions
- Authors: Rajeev Bhatt Ambati, Tianyi Niu, Aashu Singh, Shlok Mishra, Shashank Srivastava, Snigdha Chaturvedi
- Abstract summary: We show that student-led approaches consistently yield absolute Pass@k improvements of at least 0.5 over static baselines. We train students using Direct Preference Optimization (DPO) with guidance from either self or stronger students.
- Score: 21.491718334670107
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) excel at static interactions, where they answer user queries by retrieving knowledge encoded in their parameters. However, in many real-world settings, such as educational tutoring or medical assistance, relevant information is not directly available and must be actively acquired through dynamic interactions. An interactive agent would recognize its own uncertainty, ask targeted questions, and retain new knowledge efficiently. Prior work has primarily explored effective ways for a teacher to instruct the student, where the teacher identifies student gaps and provides guidance. In this work, we shift the focus to the student and investigate effective strategies to actively query the teacher in seeking useful information. Across math and coding benchmarks, where baseline student models begin with near-zero performance, we show that student-led approaches consistently yield absolute Pass@k improvements of at least 0.5 over static baselines. To improve question quality, we train students using Direct Preference Optimization (DPO) with guidance from either self or stronger students. We find that this guided training enables smaller models to learn how to ask better questions, further enhancing learning efficiency.
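As a reference for the two quantities the abstract relies on, the standard unbiased Pass@k estimator and the per-pair DPO loss can be sketched as follows. This is a minimal illustration, not the paper's implementation: the log-probability arguments to `dpo_loss` are placeholders that would come from the student policy and a frozen reference model.

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimator: the probability that at least one
    of k samples drawn from n generations (c of which are correct)
    passes. Computed as 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        return 1.0
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

def dpo_loss(pi_w: float, pi_l: float,
             ref_w: float, ref_l: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair, given per-sequence
    log-probabilities under the policy (pi_*) and the frozen
    reference model (ref_*); *_w is the preferred response."""
    margin = beta * ((pi_w - ref_w) - (pi_l - ref_l))
    # -log(sigmoid(margin)): small when the policy prefers y_w
    # more strongly than the reference does.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

For example, with n = 10 generations of which c = 1 is correct, `pass_at_k(10, 1, 5)` is exactly 0.5; and when the policy and reference assign identical log-probabilities, the DPO loss reduces to -log 0.5 ≈ 0.69.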
Related papers
- Teaching According to Students' Aptitude: Personalized Mathematics Tutoring via Persona-, Memory-, and Forgetting-Aware LLMs [28.594039597149266]
We propose TASA (Teaching According to Students' Aptitude), a student-aware tutoring framework that integrates persona, memory, and forgetting dynamics. Specifically, TASA maintains a structured student persona capturing proficiency profiles and an event memory recording prior learning interactions. By incorporating a continuous forgetting curve with knowledge tracing, TASA dynamically updates each student's mastery state and generates contextually appropriate, difficulty-calibrated questions and explanations.
arXiv Detail & Related papers (2025-11-19T06:28:16Z) - UCO: A Multi-Turn Interactive Reinforcement Learning Method for Adaptive Teaching with Large Language Models [59.693733170193944]
Large language models (LLMs) are shifting from answer providers to intelligent tutors in educational settings. Recent reinforcement learning approaches address this limitation but face two critical challenges. We propose the Unidirectional Cognitive Optimization (UCO) method to address these challenges.
arXiv Detail & Related papers (2025-11-12T01:27:02Z) - Distilling Realizable Students from Unrealizable Teachers [9.968083244726941]
We study policy distillation under privileged information, where a student policy with only partial observations must learn from a teacher with full-state access. Existing approaches either modify the teacher to produce realizable but sub-optimal demonstrations or rely on the student to explore missing information independently. We introduce two methods: (i) an imitation learning approach that adaptively determines when the student should query the teacher for corrections, and (ii) a reinforcement learning approach that selects where to initialize training for efficient exploration.
arXiv Detail & Related papers (2025-05-14T16:45:51Z) - YODA: Teacher-Student Progressive Learning for Language Models [82.0172215948963]
This paper introduces YODA, a teacher-student progressive learning framework.
It emulates the teacher-student education process to improve the efficacy of model fine-tuning.
Experiments show that training LLaMA2 with data generated by YODA yields significant performance gains over standard SFT.
arXiv Detail & Related papers (2024-01-28T14:32:15Z) - Revealing Networks: Understanding Effective Teacher Practices in AI-Supported Classrooms using Transmodal Ordered Network Analysis [0.9187505256430948]
The present study uses transmodal ordered network analysis to understand effective teacher practices in relationship to traditional metrics of in-system learning in a mathematics classroom working with AI tutors.
Comparing teacher practices by student learning rates, we find that students with low learning rates exhibited more hint use after monitoring.
Students with low learning rates showed learning behavior similar to their high learning rate peers, achieving repeated correct attempts in the tutor.
arXiv Detail & Related papers (2023-12-17T21:50:02Z) - Responsible Active Learning via Human-in-the-loop Peer Study [88.01358655203441]
We propose a responsible active learning method, namely Peer Study Learning (PSL), to simultaneously preserve data privacy and improve model stability.
We first introduce a human-in-the-loop teacher-student architecture to isolate unlabelled data from the task learner (teacher) on the cloud-side.
During training, the task learner instructs the lightweight active learner, which then provides feedback on the active sampling criterion.
arXiv Detail & Related papers (2022-11-24T13:18:27Z) - Better Teacher Better Student: Dynamic Prior Knowledge for Knowledge Distillation [70.92135839545314]
We propose dynamic prior knowledge (DPK), which integrates part of the teacher's features as prior knowledge before the feature distillation.
Our DPK makes the performance of the student model positively correlated with that of the teacher model, which means that we can further boost the accuracy of students by applying larger teachers.
arXiv Detail & Related papers (2022-06-13T11:52:13Z) - Know Thy Student: Interactive Learning with Gaussian Processes [11.641731210416102]
Our work proposes a simple diagnosis algorithm which uses Gaussian processes for inferring student-related information, before constructing a teaching dataset.
We study this in the offline reinforcement learning setting where the teacher must provide demonstrations to the student and avoid sending redundant trajectories.
Our experiments highlight the importance of diagnosing before teaching and demonstrate how students can learn more efficiently with the help of an interactive teacher.
arXiv Detail & Related papers (2022-04-26T04:43:57Z) - RLTutor: Reinforcement Learning Based Adaptive Tutoring System by Modeling Virtual Student with Fewer Interactions [10.34673089426247]
We propose a framework for optimizing teaching strategies by constructing a virtual model of the student.
Our results can serve as a buffer between theoretical instructional optimization and practical applications in e-learning systems.
arXiv Detail & Related papers (2021-07-31T15:42:03Z) - Peer-inspired Student Performance Prediction in Interactive Online Question Pools with Graph Neural Network [56.62345811216183]
We propose a novel approach using Graph Neural Networks (GNNs) to achieve better student performance prediction in interactive online question pools.
Specifically, we model the relationship between students and questions using student interactions to construct the student-interaction-question network.
We evaluate the effectiveness of our approach on a real-world dataset consisting of 104,113 mouse trajectories generated in the problem-solving process of over 4000 students on 1631 questions.
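As a toy illustration of the student-interaction-question network described above, the graph construction step (before any GNN is applied) might look like the following. The student and question IDs and edge weights here are invented for the example, not taken from the paper's dataset.

```python
# Hypothetical interaction records: (student_id, question_id, feature),
# where the feature could summarize, e.g., mouse-trajectory statistics.
interactions = [("s1", "q1", 0.8), ("s1", "q2", 0.3), ("s2", "q1", 0.5)]

def build_bipartite_graph(records):
    """Build a student-question bipartite graph: nodes are students
    and questions; each interaction contributes one weighted edge."""
    nodes, edges = set(), []
    for student, question, weight in records:
        nodes.add(student)
        nodes.add(question)
        edges.append((student, question, weight))
    return nodes, edges
```

A GNN for performance prediction would then pass messages over these edges; this sketch only covers assembling the graph itself.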
arXiv Detail & Related papers (2020-08-04T14:55:32Z) - Neural Multi-Task Learning for Teacher Question Detection in Online Classrooms [50.19997675066203]
We build an end-to-end neural framework that automatically detects questions from teachers' audio recordings.
By incorporating multi-task learning techniques, we are able to strengthen the understanding of semantic relations among different types of questions.
arXiv Detail & Related papers (2020-05-16T02:17:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.