Learning Context Matters: Measuring and Diagnosing Personalization Gaps in LLM-Based Instructional Design
- URL: http://arxiv.org/abs/2602.04972v1
- Date: Wed, 04 Feb 2026 19:02:28 GMT
- Title: Learning Context Matters: Measuring and Diagnosing Personalization Gaps in LLM-Based Instructional Design
- Authors: Johaun Hatchett, Debshila Basu Mallick, Brittany C. Bradford, Richard G. Baraniuk
- Abstract summary: We present a framework for measuring and diagnosing how the Learning Context influences instructional strategy selection. Our results show that, while providing the LC induces systematic, measurable changes in instructional decisions, substantial misalignment remains. This analysis, conducted in collaboration with subject matter experts, demonstrates that LC materially shapes LLM instructional planning but does not reliably induce pedagogically appropriate personalization.
- Score: 17.619569737556205
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The adoption of generative AI in education has accelerated dramatically in recent years, with Large Language Models (LLMs) increasingly integrated into learning environments in the hope of providing personalized support that enhances learner engagement and knowledge retention. However, truly personalized support requires access to meaningful Learning Context (LC) regarding who the learner is, what they are trying to understand, and how they are engaging with the material. In this paper, we present a framework for measuring and diagnosing how the LC influences instructional strategy selection in LLM-based tutoring systems. Using psychometrically grounded synthetic learning contexts and a pedagogically grounded decision space, we compare LLM instructional decisions in context-blind and context-aware conditions and quantify their alignment with the pedagogical judgments of subject matter experts. Our results show that, while providing the LC induces systematic, measurable changes in instructional decisions that move LLM policies closer to the subject matter expert policy, substantial misalignment remains. To diagnose this misalignment, we introduce a relevance-impact analysis that reveals which learner characteristics are attended to, ignored, or spuriously influential in LLM instructional decision-making. This analysis, conducted in collaboration with subject matter experts, demonstrates that LC materially shapes LLM instructional planning but does not reliably induce pedagogically appropriate personalization. Our results enable principled evaluation of context-aware LLM systems and provide a foundation for improving personalization through learner characteristic prioritization, pedagogical model tuning, and LC engineering.
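The comparison described in the abstract (context-blind versus context-aware LLM decisions, each measured against an expert policy) can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the strategy names, the toy decision lists, and the use of total variation distance as the alignment measure are hypothetical and are not taken from the paper, which does not state its metric in the abstract.

```python
from collections import Counter

# Hypothetical decision space; the paper's actual strategy set is not given here.
STRATEGIES = ["worked_example", "hint", "practice_problem"]


def policy(decisions):
    """Turn a list of chosen strategies into an empirical distribution."""
    counts = Counter(decisions)
    total = len(decisions)
    return {s: counts.get(s, 0) / total for s in STRATEGIES}


def total_variation(p, q):
    """Total variation distance between two distributions over STRATEGIES."""
    return 0.5 * sum(abs(p[s] - q[s]) for s in STRATEGIES)


# Toy data (invented): decisions made for the same four learner scenarios.
expert = policy(["hint", "hint", "worked_example", "practice_problem"])
blind = policy(["worked_example"] * 4)
aware = policy(["hint", "worked_example", "worked_example", "practice_problem"])

gap_blind = total_variation(blind, expert)  # distance without the Learning Context
gap_aware = total_variation(aware, expert)  # distance with the Learning Context

print(f"context-blind gap: {gap_blind:.2f}")
print(f"context-aware gap: {gap_aware:.2f}")
```

In this toy setup the context-aware policy sits closer to the expert policy than the context-blind one, yet its gap is still nonzero, which mirrors the qualitative pattern the abstract reports. A distance between aggregate distributions is only one possible choice; a per-scenario agreement rate would be an equally reasonable alternative.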
Related papers
- Multi-Agent Learning Path Planning via LLMs [10.288666777827578]
This study proposes a novel Multi-Agent Learning Path Planning framework powered by large language models (LLMs). The framework includes three task-specific agents: a learner analytics agent, a path planning agent, and a reflection agent. Experiments conducted on the MOOCX dataset using seven LLMs show that MALPP significantly outperforms baseline models in path quality, knowledge sequence consistency, and cognitive load alignment.
arXiv Detail & Related papers (2026-01-24T07:13:08Z)
- Evaluation of LLM-based Explanations for a Learning Analytics Dashboard [1.4794198430835097]
Learning Analytics Dashboards can be a powerful tool to support self-regulated learning in Digital Learning Environments. However, their effectiveness can be affected by the interpretability of the data they provide. We employ a large language model to generate verbal explanations of the data in the dashboard and evaluate it against a standalone dashboard and explanations provided by human teachers.
arXiv Detail & Related papers (2025-11-11T19:36:40Z)
- Adaptive Learning Systems: Personalized Curriculum Design Using LLM-Powered Analytics [14.157213827899342]
Large language models (LLMs) are revolutionizing the field of education by enabling personalized learning experiences tailored to individual student needs. This paper introduces a framework for Adaptive Learning Systems that leverages LLM-powered analytics for personalized curriculum design.
arXiv Detail & Related papers (2025-07-25T04:36:17Z)
- A Practical Guide for Supporting Formative Assessment and Feedback Using Generative AI [0.0]
Large language models (LLMs) can help students, teachers, and peers understand "where learners are going," "where learners currently are," and "how to move learners forward." This review provides a comprehensive foundation for integrating LLMs into formative assessment in a pedagogically informed manner.
arXiv Detail & Related papers (2025-05-29T12:52:43Z)
- Enhanced Bloom's Educational Taxonomy for Fostering Information Literacy in the Era of Large Language Models [16.31527042425208]
This paper proposes an LLM-driven Bloom's Educational Taxonomy that aims to recognize and evaluate students' information literacy (IL) with Large Language Models (LLMs). The framework divides the IL corresponding to the cognitive abilities required to use LLMs into two distinct stages: Exploration & Action and Creation & Metacognition.
arXiv Detail & Related papers (2025-03-25T08:23:49Z)
- Investigating the Zone of Proximal Development of Language Models for In-Context Learning [59.91708683601029]
We introduce a learning analytics framework to analyze the in-context learning (ICL) behavior of large language models (LLMs). We adapt the Zone of Proximal Development (ZPD) theory to ICL, measuring the ZPD of LLMs based on model performance on individual examples. Our findings reveal a series of intricate and multifaceted behaviors of ICL, providing new insights into understanding and leveraging this technique.
arXiv Detail & Related papers (2025-02-10T19:36:21Z)
- Exploring Knowledge Tracing in Tutor-Student Dialogues using LLMs [49.18567856499736]
We investigate whether large language models (LLMs) can be supportive of open-ended dialogue tutoring. We apply a range of knowledge tracing (KT) methods on the resulting labeled data to track student knowledge levels over an entire dialogue. We conduct experiments on two tutoring dialogue datasets, and show that a novel yet simple LLM-based method, LLMKT, significantly outperforms existing KT methods in predicting student response correctness in dialogues.
arXiv Detail & Related papers (2024-09-24T22:31:39Z)
- Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making [51.737762570776006]
LLM-ACTR is a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making. Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations. Our experiments on novel Design for Manufacturing tasks show both improved task performance as well as improved grounded decision-making capability.
arXiv Detail & Related papers (2024-08-17T11:49:53Z)
- Evaluating and Optimizing Educational Content with Large Language Model Judgments [52.33701672559594]
We use Language Models (LMs) as educational experts to assess the impact of various instructions on learning outcomes. We introduce an instruction optimization approach in which one LM generates instructional materials using the judgments of another LM as a reward function. Human teachers' evaluations of these LM-generated worksheets show a significant alignment between the LM judgments and human teacher preferences.
arXiv Detail & Related papers (2024-03-05T09:09:15Z)
- Rethinking Machine Unlearning for Large Language Models [85.92660644100582]
We explore machine unlearning in the domain of large language models (LLMs). This initiative aims to eliminate undesirable data influence (e.g., sensitive or illegal information) and the associated model capabilities.
arXiv Detail & Related papers (2024-02-13T20:51:58Z)
- Exploring the Cognitive Knowledge Structure of Large Language Models: An Educational Diagnostic Assessment Approach [50.125704610228254]
Large Language Models (LLMs) have not only exhibited exceptional performance across various tasks, but also demonstrated sparks of intelligence. Recent studies have focused on assessing their capabilities on human exams and revealed their impressive competence in different domains. We conduct an evaluation using MoocRadar, a meticulously annotated human test dataset based on Bloom taxonomy.
arXiv Detail & Related papers (2023-10-12T09:55:45Z)