Listening with Language Models: Using LLMs to Collect and Interpret Classroom Feedback
- URL: http://arxiv.org/abs/2508.11707v1
- Date: Wed, 13 Aug 2025 22:53:55 GMT
- Title: Listening with Language Models: Using LLMs to Collect and Interpret Classroom Feedback
- Authors: Sai Siddartha Maram, Ulia Zaman, Magy Seif El-Nasr
- Abstract summary: Large Language Model (LLM)-powered chatbots can reimagine the classroom feedback process by engaging students in reflective, conversational dialogues. Our findings suggest that LLM-based feedback systems offer richer insights, greater contextual relevance, and higher engagement compared to standard survey tools.
- Score: 14.83267437400996
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Traditional end-of-quarter surveys often fail to provide instructors with timely, detailed, and actionable feedback about their teaching. In this paper, we explore how Large Language Model (LLM)-powered chatbots can reimagine the classroom feedback process by engaging students in reflective, conversational dialogues. Through the design and deployment of a three-part system (PromptDesigner, FeedbackCollector, and FeedbackAnalyzer), we conducted a pilot study across two graduate courses at UC Santa Cruz. Our findings suggest that LLM-based feedback systems offer richer insights, greater contextual relevance, and higher engagement compared to standard survey tools. Instructors valued the system's adaptability, specificity, and ability to support mid-course adjustments, while students appreciated the conversational format and opportunity for elaboration. We conclude by discussing the design implications of using AI to facilitate more meaningful and responsive feedback in higher education.
Related papers
- Personalized and Constructive Feedback for Computer Science Students Using the Large Language Model (LLM) [0.8409304328108455]
This paper investigates the performance of Large Language Models (LLMs) in processing students' assessments with predefined rubrics and marking criteria. We aim to leverage the power of existing LLMs for Marking Assessments, Tracking, and Evaluation (LLM-MATE) with personalized feedback to enhance students' learning.
arXiv Detail & Related papers (2025-10-13T15:59:30Z) - Automated Feedback on Student-Generated UML and ER Diagrams Using Large Language Models [39.58317527488534]
We introduce DUET (Diamatic & ER Tutor), a prototype of an LLM-based tool. It converts a reference diagram and a student-submitted diagram into a textual representation and provides structured feedback based on the differences. It uses a multi-stage LLM pipeline to compare diagrams and generate reflective feedback. It enables analytical insights for educators, aiming to foster self-directed learning and inform instructional strategies.
arXiv Detail & Related papers (2025-07-31T11:49:01Z) - User Feedback in Human-LLM Dialogues: A Lens to Understand Users But Noisy as a Learning Signal [59.120335322495436]
We analyze user feedback in the user-LLM conversation logs, providing insights into when and why such feedback occurs. Second, we study harvesting learning signals from such implicit user feedback.
arXiv Detail & Related papers (2025-07-30T23:33:29Z) - Playpen: An Environment for Exploring Learning Through Conversational Interaction [84.0413820245725]
We investigate whether Dialogue Games can also serve as a source of feedback signals for learning. We introduce Playpen, an environment for off- and online learning through Dialogue Game self-play. We find that imitation learning through SFT improves performance on unseen instances, but negatively impacts other skills.
arXiv Detail & Related papers (2025-04-11T14:49:33Z) - SEFL: Enhancing Educational Assignment Feedback with LLM Agents [5.191286314473505]
Synthetic Educational Feedback Loops (SEFL) is a synthetic data framework designed to generate data that resembles immediate, on-demand feedback at scale. To get this type of data, two large language models (LLMs) operate in teacher-student roles to simulate assignment completion and formative feedback. We show that SEFL-tuned models outperform both their non-tuned counterparts in feedback quality and an existing baseline.
arXiv Detail & Related papers (2025-02-18T15:09:29Z) - "My Grade is Wrong!": A Contestable AI Framework for Interactive Feedback in Evaluating Student Essays [6.810086342993699]
This paper introduces CAELF, a Contestable AI Empowered LLM Framework for automating interactive feedback.
CAELF allows students to query, challenge, and clarify their feedback by integrating a multi-agent system with computational argumentation.
A case study on 500 critical thinking essays with user studies demonstrates that CAELF significantly improves interactive feedback.
arXiv Detail & Related papers (2024-09-11T17:59:01Z) - Joint Learning of Context and Feedback Embeddings in Spoken Dialogue [3.8673630752805446]
We investigate the possibility of embedding short dialogue contexts and feedback responses in the same representation space using a contrastive learning objective.
Our results show that the model outperforms humans given the same ranking task and that the learned embeddings carry information about the conversational function of feedback responses.
arXiv Detail & Related papers (2024-06-11T14:22:37Z) - Generating Situated Reflection Triggers about Alternative Solution Paths: A Case Study of Generative AI for Computer-Supported Collaborative Learning [3.2721068185888127]
We present a proof-of-concept application to offer students dynamic and contextualized feedback.
Specifically, we augment an Online Programming Exercise bot for a college-level Cloud Computing course with ChatGPT.
We demonstrate that LLMs can be used to generate highly situated reflection triggers that incorporate details of the collaborative discussion happening in context.
arXiv Detail & Related papers (2024-04-28T17:56:14Z) - Improving the Validity of Automatically Generated Feedback via Reinforcement Learning [46.667783153759636]
We propose a framework for feedback generation that optimizes both correctness and alignment using reinforcement learning (RL). Specifically, we use GPT-4's annotations to create preferences over feedback pairs in an augmented dataset for training via direct preference optimization (DPO).
arXiv Detail & Related papers (2024-03-02T20:25:50Z) - System-Level Natural Language Feedback [83.24259100437965]
We show how to use feedback to formalize system-level design decisions in a human-in-the-loop-process.
We conduct two case studies of this approach for improving search query and dialog response generation.
We show the combination of system-level and instance-level feedback brings further gains.
arXiv Detail & Related papers (2023-06-23T16:21:40Z) - Rethinking the Evaluation for Conversational Recommendation in the Era of Large Language Models [115.7508325840751]
The recent success of large language models (LLMs) has shown great potential to develop more powerful conversational recommender systems (CRSs).
In this paper, we embark on an investigation into the utilization of ChatGPT for conversational recommendation, revealing the inadequacy of the existing evaluation protocol.
We propose an interactive Evaluation approach based on LLMs named iEvaLM that harnesses LLM-based user simulators.
arXiv Detail & Related papers (2023-05-22T15:12:43Z) - Structural Pre-training for Dialogue Comprehension [51.215629336320305]
We present SPIDER, Structural Pre-traIned DialoguE Reader, to capture dialogue exclusive features.
To simulate the dialogue-like features, we propose two training objectives in addition to the original LM objectives.
Experimental results on widely used dialogue benchmarks verify the effectiveness of the newly introduced self-supervised tasks.
arXiv Detail & Related papers (2021-05-23T15:16:54Z) - Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues [88.73739515457116]
We introduce four self-supervised tasks including next session prediction, utterance restoration, incoherence detection and consistency discrimination.
We jointly train the PLM-based response selection model with these auxiliary tasks in a multi-task manner.
Experiment results indicate that the proposed auxiliary self-supervised tasks bring significant improvement for multi-turn response selection.
arXiv Detail & Related papers (2020-09-14T08:44:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.