Reinforcement Learning in Education: A Multi-Armed Bandit Approach
- URL: http://arxiv.org/abs/2211.00779v1
- Date: Tue, 1 Nov 2022 22:47:17 GMT
- Title: Reinforcement Learning in Education: A Multi-Armed Bandit Approach
- Authors: Herkulaas Combrink, Vukosi Marivate, Benjamin Rosman
- Abstract summary: Reinforcement learning solves unsupervised problems where agents move through a state-action-reward loop to maximize the overall reward for the agent.
The aim of this study was to contextualise and simulate the cumulative reward within an environment for an intervention recommendation problem in the education context.
- Score: 12.358921226358133
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Advances in reinforcement learning research have demonstrated the ways in which different agent-based models can learn how to optimally perform a task within a given environment. Reinforcement learning solves unsupervised problems where agents move through a state-action-reward loop to maximize the overall reward for the agent, which in turn optimizes the solving of a specific problem in a given environment. However, these algorithms are designed based on our understanding of actions that should be taken in a real-world environment to solve a specific problem. One such problem is the ability to identify, recommend and execute an action within a system where the users are the subject, such as in education. In recent years, the use of blended learning approaches integrating face-to-face learning with online learning in the education context has increased. Additionally, online platforms used for education require the automation of certain functions such as the identification, recommendation or execution of actions that can benefit the user, in this case the student or learner. As promising as these scientific advances are, there is still a need to conduct research in a variety of different areas to ensure the successful deployment of these agents within education systems. Therefore, the aim of this study was to contextualise and simulate the cumulative reward within an environment for an intervention recommendation problem in the education context.
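To make the state-action-reward loop and the cumulative-reward simulation described in the abstract concrete, the sketch below shows a minimal epsilon-greedy multi-armed bandit that recommends one of several hypothetical education interventions and tracks cumulative reward. This is an illustrative example under assumed settings, not the authors' implementation: the intervention names, success probabilities, exploration rate and horizon are all made up for demonstration.

```python
# Minimal sketch (assumed example, not the paper's code): epsilon-greedy
# multi-armed bandit for an education intervention recommendation problem.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical interventions (arms) and assumed success probabilities.
arms = ["extra_tutorial", "practice_quiz", "peer_session", "no_intervention"]
true_success_prob = np.array([0.55, 0.40, 0.65, 0.30])  # illustrative only

epsilon = 0.1                  # exploration rate (assumed)
n_rounds = 5000                # simulated interactions (assumed)
counts = np.zeros(len(arms))   # how often each intervention was recommended
values = np.zeros(len(arms))   # running mean reward per intervention
cumulative_reward = 0.0

for t in range(n_rounds):
    # Action step: explore a random intervention or exploit the best estimate.
    if rng.random() < epsilon:
        action = rng.integers(len(arms))
    else:
        action = int(np.argmax(values))

    # Simulated environment: reward 1 if the intervention "helps" the student.
    reward = float(rng.random() < true_success_prob[action])

    # Reward step: update the estimate for the chosen arm and the cumulative reward.
    counts[action] += 1
    values[action] += (reward - values[action]) / counts[action]
    cumulative_reward += reward

print("estimated intervention values:", dict(zip(arms, values.round(3))))
print(f"cumulative reward after {n_rounds} rounds: {cumulative_reward:.0f}")
```

In the study's setting, the reward signal would come from observed student outcomes rather than the fixed Bernoulli probabilities assumed here.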
Related papers
- I Know How: Combining Prior Policies to Solve New Tasks [17.214443593424498]
Multi-Task Reinforcement Learning aims at developing agents that are able to continually evolve and adapt to new scenarios.
Learning from scratch for each new task is not a viable or sustainable option.
We propose a new framework, I Know How, which provides a common formalization.
arXiv Detail & Related papers (2024-06-14T08:44:51Z)
- RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar to, but potentially even more practical than, those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over a potentially suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z)
- Towards Improving Exploration in Self-Imitation Learning using Intrinsic Motivation [7.489793155793319]
Reinforcement Learning has emerged as a strong alternative to solve optimization tasks efficiently.
The use of these algorithms depends heavily on the feedback signals provided by the environment, which indicate how good (or bad) the decisions made by the learned agent are.
In this work, intrinsic motivation is used to encourage the agent to explore the environment based on its curiosity, whereas imitation learning allows repeating the most promising experiences to accelerate the learning process.
arXiv Detail & Related papers (2022-11-30T09:18:59Z)
- L2Explorer: A Lifelong Reinforcement Learning Assessment Environment [49.40779372040652]
Reinforcement learning solutions tend to generalize poorly when exposed to new tasks outside of the data distribution they are trained on.
We introduce a framework for continual reinforcement-learning development and assessment using the Lifelong Learning Explorer (L2Explorer).
L2Explorer is a new, Unity-based, first-person 3D exploration environment that can be continuously reconfigured to generate a range of tasks and task variants structured into complex evaluation curricula.
arXiv Detail & Related papers (2022-03-14T19:20:26Z)
- Robust Reinforcement Learning via Genetic Curriculum [5.421464476555662]
Genetic curriculum is an algorithm that automatically identifies scenarios in which the agent currently fails and generates an associated curriculum.
Our empirical studies show an improvement in robustness over existing state-of-the-art algorithms, providing training curricula that result in agents being 2 to 8 times less likely to fail.
arXiv Detail & Related papers (2022-02-17T01:14:20Z)
- Rethinking Learning Dynamics in RL using Adversarial Networks [79.56118674435844]
We present a learning mechanism for reinforcement learning of closely related skills parameterized via a skill embedding space.
The main contribution of our work is to formulate an adversarial training regime for reinforcement learning with the help of an entropy-regularized policy gradient formulation.
arXiv Detail & Related papers (2022-01-27T19:51:09Z)
- One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL [142.36621929739707]
We show that learning diverse behaviors for accomplishing a task can lead to behavior that generalizes to varying environments.
By identifying multiple solutions for the task in a single environment during training, our approach can generalize to new situations.
arXiv Detail & Related papers (2020-10-27T17:41:57Z)
- Importance Weighted Policy Learning and Adaptation [89.46467771037054]
We study a complementary approach which is conceptually simple, general, modular and built on top of recent improvements in off-policy learning.
The framework is inspired by ideas from the probabilistic inference literature and combines robust off-policy learning with a behavior prior.
Our approach achieves competitive adaptation performance on hold-out tasks compared to meta reinforcement learning baselines and can scale to complex sparse-reward scenarios.
arXiv Detail & Related papers (2020-09-10T14:16:58Z)
- Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey [53.73359052511171]
Reinforcement learning (RL) is a popular paradigm for addressing sequential decision tasks in which the agent has only limited environmental feedback.
We present a framework for curriculum learning (CL) in RL, and use it to survey and classify existing CL methods in terms of their assumptions, capabilities, and goals.
arXiv Detail & Related papers (2020-03-10T20:41:24Z)
- Human AI interaction loop training: New approach for interactive reinforcement learning [0.0]
Reinforcement Learning (RL) provides effective results in various machine learning decision-making tasks, with an agent learning from a stand-alone reward function.
RL presents unique challenges with large amounts of environment states and action spaces, as well as in the determination of rewards.
Imitation Learning (IL) offers a promising solution for those challenges using a teacher.
arXiv Detail & Related papers (2020-03-09T15:27:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.