Related papers: A Bandit-Based Approach to Educational Recommender Systems: Contextual Thompson Sampling for Learner Skill Gain Optimization

URL: http://arxiv.org/abs/2602.04347v1
Date: Wed, 04 Feb 2026 09:14:53 GMT
Title: A Bandit-Based Approach to Educational Recommender Systems: Contextual Thompson Sampling for Learner Skill Gain Optimization
Authors: Lukas De Kerpel, Arthur Thuy, Dries F. Benoit
Abstract summary: This paper introduces a method that generates personalized sequences of exercises by selecting, at each step, the exercise most likely to advance a learner's understanding of a targeted skill. Using data from an online mathematics tutoring platform, we find that the approach recommends exercises associated with greater skill improvement and adapts effectively to differences across learners. From an instructional perspective, the framework enables personalized practice at scale, highlights exercises with consistently strong learning value, and helps instructors identify learners who may benefit from additional support.
Score: 0.45880283710344055
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In recent years, instructional practices in Operations Research (OR), Management Science (MS), and Analytics have increasingly shifted toward digital environments, where large and diverse groups of learners make it difficult to provide practice that adapts to individual needs. This paper introduces a method that generates personalized sequences of exercises by selecting, at each step, the exercise most likely to advance a learner's understanding of a targeted skill. The method uses information about the learner and their past performance to guide these choices, and learning progress is measured as the change in estimated skill level before and after each exercise. Using data from an online mathematics tutoring platform, we find that the approach recommends exercises associated with greater skill improvement and adapts effectively to differences across learners. From an instructional perspective, the framework enables personalized practice at scale, highlights exercises with consistently strong learning value, and helps instructors identify learners who may benefit from additional support.
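The abstract does not spell out the model, so the following is only a minimal sketch of how contextual Thompson sampling could pick an exercise at each step. It assumes a separate Bayesian linear regression per exercise with a Gaussian prior and known noise variance, and it treats the reward as the observed skill gain (estimated skill after minus before the exercise). The class name `LinearThompsonSampler`, its parameters, and the feature vector are hypothetical and not taken from the paper.

```python
import numpy as np


class LinearThompsonSampler:
    """Illustrative linear-Gaussian Thompson sampling, one model per exercise.

    Assumption: expected skill gain is linear in the learner context.
    This is a generic textbook formulation, not the paper's exact model.
    """

    def __init__(self, n_exercises, n_features, noise_var=1.0, prior_var=1.0, seed=0):
        self.rng = np.random.default_rng(seed)
        self.noise_var = noise_var
        # Posterior precision matrix per exercise, initialised to the prior precision.
        self.precision = np.stack(
            [np.eye(n_features) / prior_var for _ in range(n_exercises)]
        )
        # Accumulated (context * reward / noise_var) per exercise.
        self.b = np.zeros((n_exercises, n_features))

    def select(self, context):
        """Sample one weight vector per exercise and pick the highest sampled gain."""
        sampled_gains = []
        for precision, b in zip(self.precision, self.b):
            cov = np.linalg.inv(precision)
            mean = cov @ b
            theta = self.rng.multivariate_normal(mean, cov)
            sampled_gains.append(theta @ context)
        return int(np.argmax(sampled_gains))

    def update(self, exercise, context, skill_gain):
        """Fold one observed skill gain into the chosen exercise's posterior."""
        self.precision[exercise] += np.outer(context, context) / self.noise_var
        self.b[exercise] += context * skill_gain / self.noise_var


# Hypothetical usage: two exercises, two learner features.
sampler = LinearThompsonSampler(n_exercises=2, n_features=2)
learner_context = np.array([1.0, 0.5])
chosen = sampler.select(learner_context)
sampler.update(chosen, learner_context, skill_gain=0.3)
```

Sampling from the posterior, rather than always taking the posterior mean, is what lets the recommender keep trying exercises whose learning value is still uncertain.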

Related papers

Adaptive Learning Systems: Personalized Curriculum Design Using LLM-Powered Analytics [14.157213827899342]
Large language models (LLMs) are revolutionizing the field of education by enabling personalized learning experiences tailored to individual student needs. This paper introduces a framework for Adaptive Learning Systems that leverages LLM-powered analytics for personalized curriculum design.
arXiv Detail & Related papers (2025-07-25T04:36:17Z)
Dynamic Skill Adaptation for Large Language Models [78.31322532135272]
We present Dynamic Skill Adaptation (DSA), an adaptive and dynamic framework to adapt novel and complex skills to Large Language Models (LLMs). For every skill, we utilize LLMs to generate both textbook-like data, which contains detailed descriptions of skills for pre-training, and exercise-like data, which targets explicit use of the skills to solve problems for instruction-tuning. Experiments on large language models such as LLAMA and Mistral demonstrate the effectiveness of our proposed methods in adapting math reasoning skills and social study skills.
arXiv Detail & Related papers (2024-12-26T22:04:23Z)
RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar but potentially even more practical than those of interactive imitation learning. Our proposed method uses reinforcement learning with user intervention signals themselves as rewards. This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over the potentially suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z)
Knowledge Tracing Challenge: Optimal Activity Sequencing for Students [0.9814642627359286]
Knowledge tracing is a method used in education to assess and track the acquisition of knowledge by individual learners. We will present the results of the implementation of two Knowledge Tracing algorithms on a newly released dataset as part of the AAAI2023 Global Knowledge Tracing Challenge.
arXiv Detail & Related papers (2023-11-13T16:28:34Z)
Enhancing Digital Health Services: A Machine Learning Approach to Personalized Exercise Goal Setting [8.146832452474777]
This study aims to develop a machine learning algorithm that dynamically updates automatically suggested exercise goals using retrospective data and realistic behavior trajectories. The deep reinforcement learning algorithm combines deep learning techniques to analyse time series data and infer user exercise behavior.
arXiv Detail & Related papers (2022-04-03T01:19:20Z)
Optimizing piano practice with a utility-based scaffold [59.821144959060305]
A typical part of learning to play the piano is the progression through a series of practice units that focus on individual dimensions of the skill. Because we each learn differently, and because there are many choices for possible piano practice tasks and methods, the set of practice tasks should be dynamically adapted to the human learner. We present a modeling framework to guide the human learner through the learning process by choosing practice modes that have the highest expected utility.
arXiv Detail & Related papers (2021-06-21T14:05:00Z)
Teaching with Commentaries [108.62722733649542]
We propose a flexible teaching framework using commentaries and learned meta-information. We find that commentaries can improve training speed and/or performance. Commentaries can be reused when training new models to obtain performance benefits.
arXiv Detail & Related papers (2020-11-05T18:52:46Z)
Meta-learning the Learning Trends Shared Across Tasks [123.10294801296926]
Gradient-based meta-learning algorithms excel at quick adaptation to new tasks with limited data. Existing meta-learning approaches only depend on the current task information during the adaptation. We propose a 'Path-aware' model-agnostic meta-learning approach.
arXiv Detail & Related papers (2020-10-19T08:06:47Z)
Teaching to Learn: Sequential Teaching of Agents with Inner States [20.556373950863247]
We introduce a multi-agent formulation in which learners' inner state may change with the teaching interaction. In order to teach such learners, we propose an optimal control approach that takes the future performance of the learner after teaching into account.
arXiv Detail & Related papers (2020-09-14T07:03:15Z)
Revisiting Meta-Learning as Supervised Learning [69.2067288158133]
We aim to provide a principled, unifying framework by revisiting and strengthening the connection between meta-learning and traditional supervised learning. By treating pairs of task-specific data sets and target models as (feature, label) samples, we can reduce many meta-learning algorithms to instances of supervised learning. This view not only unifies meta-learning into an intuitive and practical framework but also allows us to transfer insights from supervised learning directly to improve meta-learning.
arXiv Detail & Related papers (2020-02-03T06:13:01Z)

This list is automatically generated from the titles and abstracts of the papers on this site.