Ensembling Large Language Models to Characterize Affective Dynamics in Student-AI Tutor Dialogues
- URL: http://arxiv.org/abs/2510.13862v1
- Date: Mon, 13 Oct 2025 04:43:56 GMT
- Title: Ensembling Large Language Models to Characterize Affective Dynamics in Student-AI Tutor Dialogues
- Authors: Chenyu Zhang, Sharifa Alghowinem, Cynthia Breazeal
- Abstract summary: This work introduces the first ensemble-LLM framework for large-scale affect sensing in tutoring dialogues. We analyzed two semesters' worth of 16,986 conversational turns exchanged between PyTutor, an AI tutor, and 261 undergraduate learners across three U.S. institutions.
- Score: 18.497635186707008
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While recent studies have examined the learning impact of large language models (LLMs) in educational contexts, the affective dynamics of LLM-mediated tutoring remain insufficiently understood. This work introduces the first ensemble-LLM framework for large-scale affect sensing in tutoring dialogues, advancing the conversation on responsible pathways for integrating generative AI into education by attending to learners' evolving affective states. To achieve this, we analyzed two semesters' worth of 16,986 conversational turns exchanged between PyTutor, an LLM-powered AI tutor, and 261 undergraduate learners across three U.S. institutions. To investigate learners' emotional experiences, we generate zero-shot affect annotations from three frontier LLMs (Gemini, GPT-4o, Claude), including scalar ratings of valence, arousal, and learning-helpfulness, along with free-text emotion labels. These estimates are fused through rank-weighted intra-model pooling and plurality consensus across models to produce robust emotion profiles. Our analysis shows that during interaction with the AI tutor, students typically report mildly positive affect and moderate arousal. Yet learning is not uniformly smooth: confusion and curiosity are frequent companions to problem solving, and frustration, while less common, still surfaces in ways that can derail progress. Emotional states are short-lived: positive moments last slightly longer than neutral or negative ones, but they are fragile and easily disrupted. Encouragingly, negative emotions often resolve quickly, sometimes rebounding directly into positive states. Neutral moments frequently act as turning points, more often steering students upward than downward, suggesting opportunities for tutors to intervene at precisely these junctures.
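The fusion step the abstract describes (rank-weighted pooling of each model's repeated ratings, then plurality consensus on the free-text labels across models) can be sketched roughly as follows. The paper's exact ranking scheme and weights are not given here, so the median-distance ranking, reciprocal-rank weights, and function names below are illustrative assumptions, not the authors' implementation:

```python
from collections import Counter


def rank_weighted_pool(ratings):
    """Pool several scalar ratings (e.g. valence) from one model.

    Hypothetical scheme: rank ratings by distance from their median,
    then weight each by the reciprocal of its rank, so estimates
    closer to the central tendency contribute more to the pooled value.
    """
    median = sorted(ratings)[len(ratings) // 2]
    ranked = sorted(ratings, key=lambda r: abs(r - median))
    weights = [1.0 / (i + 1) for i in range(len(ranked))]
    return sum(w * r for w, r in zip(weights, ranked)) / sum(weights)


def plurality_consensus(labels):
    """Return the most frequent free-text emotion label across models."""
    return Counter(labels).most_common(1)[0][0]


# Per-model pooled valence, then a cross-model consensus emotion label.
pooled_valence = rank_weighted_pool([0.6, 0.7, 0.9])
consensus = plurality_consensus(["confusion", "curiosity", "confusion"])
```

Down-weighting outlying samples before averaging is what makes the pooled rating robust to a single off-distribution LLM judgment, while the plurality vote resolves disagreement on the categorical emotion label.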
Related papers
- Decoding Student Minds: Leveraging Conversational Agents for Psychological and Learning Analysis [0.15293427903448018]
This paper presents a psychologically-aware conversational agent designed to enhance both learning performance and emotional well-being in educational settings. The system combines Large Language Models (LLMs), a knowledge graph-enhanced BERT (KG-BERT), and a bidirectional Long Short-Term Memory (LSTM) with attention to classify students' cognitive and affective states in real time.
arXiv Detail & Related papers (2025-12-11T09:06:45Z) - From Passive to Persuasive: Steering Emotional Nuance in Human-AI Negotiation [14.919711528167054]
This paper demonstrates that targeted activation engineering can steer LLaMA 3.1-8B to exhibit more human-like emotional nuances. Applying these vectors to new conversational prompts significantly enhances emotional characteristics.
arXiv Detail & Related papers (2025-11-16T23:33:06Z) - UCO: A Multi-Turn Interactive Reinforcement Learning Method for Adaptive Teaching with Large Language Models [59.693733170193944]
Large language models (LLMs) are shifting from answer providers to intelligent tutors in educational settings. Recent reinforcement learning approaches address this limitation but face two critical challenges. We propose the Unidirectional Cognitive Optimization (UCO) method to address these challenges.
arXiv Detail & Related papers (2025-11-12T01:27:02Z) - Understanding the Dilemma of Unlearning for Large Language Models [50.54260066313032]
Unlearning seeks to remove specific knowledge from large language models (LLMs). We propose unPact, an interpretable framework for unlearning via prompt attribution and contribution tracking.
arXiv Detail & Related papers (2025-09-29T12:15:19Z) - Incongruent Positivity: When Miscalibrated Positivity Undermines Online Supportive Conversations [0.0]
In emotionally supportive conversations, well-intended positivity can sometimes misfire, leading to responses that feel dismissive, minimizing, or unrealistically optimistic. We examine this phenomenon of incongruent positivity as miscalibrated expressions of positive support in both human- and LLM-generated responses. Our findings highlight the need to move beyond generating generic positive responses and instead study congruent support measures that balance positive affect with emotional acknowledgment.
arXiv Detail & Related papers (2025-09-12T12:25:02Z) - MathBuddy: A Multimodal System for Affective Math Tutoring [10.968012903118975]
MathBuddy is an emotionally aware, LLM-powered math tutor. It maps the student's emotions to relevant pedagogical strategies, making the tutor-student conversation more empathetic. We report a 23-point performance gain in win rate and a 3-point overall gain in DAMR scores.
arXiv Detail & Related papers (2025-08-27T15:50:43Z) - RLVER: Reinforcement Learning with Verifiable Emotion Rewards for Empathetic Agents [67.46032287312339]
Large language models (LLMs) excel at logical and algorithmic reasoning, yet their emotional intelligence (EQ) still lags far behind their cognitive prowess. We introduce RLVER, the first end-to-end reinforcement learning framework that leverages verifiable emotion rewards from simulated users. Our results show that RLVER is a practical route toward emotionally intelligent and broadly capable language agents.
arXiv Detail & Related papers (2025-07-03T18:33:18Z) - Learning to Focus: Causal Attention Distillation via Gradient-Guided Token Pruning [62.23671919314693]
Large language models (LLMs) have demonstrated significant improvements in contextual understanding. However, their ability to attend to truly critical information during long-context reasoning and generation still lags behind. We introduce a two-stage framework called Learning to Focus (LeaF) to mitigate confounding factors.
arXiv Detail & Related papers (2025-06-09T15:16:39Z) - Conversations: Love Them, Hate Them, Steer Them [10.014248704653]
Large Language Models (LLMs) demonstrate increasing conversational fluency, yet instilling them with nuanced, human-like emotional expression remains a significant challenge. This paper demonstrates that targeted activation engineering can steer LLaMA 3.1-8B to exhibit more human-like emotional nuances.
arXiv Detail & Related papers (2025-05-23T02:58:45Z) - Student-AI Interaction in an LLM-Empowered Learning Environment: A Cluster Analysis of Engagement Profiles [28.794946431719392]
This study explored diverse learner profiles within a multi-agent, LLM-empowered learning environment. Students exhibit varied behavioral, cognitive, and emotional engagement tendencies.
arXiv Detail & Related papers (2025-03-03T16:08:28Z) - Exploring Knowledge Tracing in Tutor-Student Dialogues using LLMs [49.18567856499736]
We investigate whether large language models (LLMs) can support open-ended dialogue tutoring. We apply a range of knowledge tracing (KT) methods to the resulting labeled data to track student knowledge levels over an entire dialogue. Experiments on two tutoring dialogue datasets show that a novel yet simple LLM-based method, LLMKT, significantly outperforms existing KT methods in predicting student response correctness in dialogues.
arXiv Detail & Related papers (2024-09-24T22:31:39Z) - Modulating Language Model Experiences through Frictions [56.17593192325438]
Over-consumption of language model outputs risks propagating unchecked errors in the short-term and damaging human capabilities for critical thinking in the long-term.
We propose selective frictions for language model experiences, inspired by behavioral science interventions, to dampen misuse.
arXiv Detail & Related papers (2024-06-24T16:31:11Z) - Opportunities and Challenges in Neural Dialog Tutoring [54.07241332881601]
We rigorously analyze various generative language models on two dialog tutoring datasets for language learning.
We find that although current approaches can model tutoring in constrained learning scenarios, they perform poorly in less constrained scenarios.
Our human quality evaluation shows that both models and ground-truth annotations exhibit low performance in terms of equitable tutoring.
arXiv Detail & Related papers (2023-01-24T11:00:17Z)
This list is automatically generated from the titles and abstracts of papers on this site.