Proximal Curriculum for Reinforcement Learning Agents
- URL: http://arxiv.org/abs/2304.12877v1
- Date: Tue, 25 Apr 2023 14:49:34 GMT
- Title: Proximal Curriculum for Reinforcement Learning Agents
- Authors: Georgios Tzannetos, Bárbara Gomes Ribeiro, Parameswaran Kamalaruban, Adish Singla
- Abstract summary: We design our curriculum strategy, ProCuRL, inspired by the pedagogical concept of Zone of Proximal Development (ZPD).
ProCuRL captures the intuition that learning progress is maximized when picking tasks that are neither too hard nor too easy for the learner.
- Score: 17.654532900660712
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of curriculum design for reinforcement learning (RL)
agents in contextual multi-task settings. Existing techniques on automatic
curriculum design typically require domain-specific hyperparameter tuning or
have limited theoretical underpinnings. To tackle these limitations, we design
our curriculum strategy, ProCuRL, inspired by the pedagogical concept of Zone
of Proximal Development (ZPD). ProCuRL captures the intuition that learning
progress is maximized when picking tasks that are neither too hard nor too easy
for the learner. We mathematically derive ProCuRL by analyzing two simple
learning settings. We also present a practical variant of ProCuRL that can be
directly integrated with deep RL frameworks with minimal hyperparameter tuning.
Experimental results on a variety of domains demonstrate the effectiveness of
our curriculum strategy over state-of-the-art baselines in accelerating the
training process of deep RL agents.
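As a concrete illustration of the selection rule, below is a minimal sketch assuming the curriculum can estimate the learner's current probability of success PoS_t(s) on each task s and the optimal probability of success PoS*(s). The product score PoS_t(s) * (PoS*(s) - PoS_t(s)), which vanishes for tasks that are already mastered or currently unachievable, is one way to realize the "neither too hard nor too easy" intuition; the function and names here are illustrative, not the paper's exact implementation.

    import numpy as np

    def procurl_pick_task(pos_cur, pos_opt):
        """Pick the next training task under a ZPD-style scoring rule.

        pos_cur: per-task current success probabilities PoS_t(s).
        pos_opt: per-task (estimated) optimal success probabilities PoS*(s).
        Tasks that are too easy (PoS_t near PoS*) or too hard (PoS_t near 0)
        both receive low scores.
        """
        pos_cur = np.asarray(pos_cur, dtype=float)
        pos_opt = np.asarray(pos_opt, dtype=float)
        scores = pos_cur * (pos_opt - pos_cur)
        return int(np.argmax(scores))

    # Toy example: task 1 sits in the "zone of proximal development".
    print(procurl_pick_task(pos_cur=[0.95, 0.5, 0.02], pos_opt=[1.0, 1.0, 1.0]))  # -> 1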
Related papers
- PCGRL+: Scaling, Control and Generalization in Reinforcement Learning Level Generators [2.334978724544296]
Procedural Content Generation via Reinforcement Learning (PCGRL) has been introduced as a means by which controllable designer agents can be trained.
PCGRL offers a unique set of affordances for game designers, but it is constrained by the compute-intensive process of training RL agents.
We implement several PCGRL environments in Jax so that all aspects of learning and simulation happen in parallel on the GPU.
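That GPU parallelism comes from batching a pure step function with jax.vmap; the toy sketch below shows the general pattern with a trivial random-walk-style environment, not the actual PCGRL environments.

    import jax
    import jax.numpy as jnp

    def step(state, action):
        # Toy pure transition: move the position, reward closeness to zero.
        new_state = state + action
        reward = -jnp.abs(new_state)
        return new_state, reward

    # Vectorize the single-env step over a batch dimension and JIT-compile it,
    # so thousands of environments advance in one fused GPU call.
    batched_step = jax.jit(jax.vmap(step))

    states = jnp.zeros(4096)
    actions = jnp.ones(4096)
    states, rewards = batched_step(states, actions)
    print(states.shape, rewards.shape)  # (4096,) (4096,)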
arXiv Detail & Related papers (2024-08-22T16:30:24Z)
- Model-Based Transfer Learning for Contextual Reinforcement Learning [5.5597941107270215]
We introduce Model-Based Transfer Learning to solve contextual RL problems.
We show theoretically that the method exhibits sublinear regret in the number of training tasks.
We experimentally validate our methods using urban traffic and standard continuous control benchmarks.
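A rough sketch of the model-based idea as summarized above: estimate how well a policy trained on one context transfers to others, and greedily pick the next training context that most improves expected performance across the context set. The linear generalization-gap model and all names below are illustrative assumptions, not the paper's implementation.

    import numpy as np

    def pick_training_contexts(contexts, budget, gap_slope=0.5):
        """Greedy model-based selection of training contexts.

        Assumed generalization model: a policy trained on context c scores
        max(0, 1 - gap_slope * |c - c'|) when deployed on context c'.
        Each round picks the context whose addition most raises the mean of
        the best achievable performance across all contexts.
        """
        contexts = np.asarray(contexts, dtype=float)
        perf = lambda c: np.maximum(0.0, 1.0 - gap_slope * np.abs(contexts - c))
        chosen, best = [], np.zeros(len(contexts))
        for _ in range(budget):
            remaining = [i for i in range(len(contexts)) if i not in chosen]
            gains = [np.maximum(best, perf(contexts[i])).mean() for i in remaining]
            pick = remaining[int(np.argmax(gains))]
            chosen.append(pick)
            best = np.maximum(best, perf(contexts[pick]))
        return chosen

    print(pick_training_contexts(contexts=np.linspace(0, 10, 21), budget=3))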
arXiv Detail & Related papers (2024-08-08T14:46:01Z)
- Proximal Curriculum with Task Correlations for Deep Reinforcement Learning [25.10619062353793]
We consider curriculum design in contextual multi-task settings where the agent's final performance is measured w.r.t. a target distribution over complex tasks.
We propose a novel curriculum, ProCuRL-Target, that balances selecting tasks that are not too difficult for the agent against progressing the agent's learning toward the target distribution by leveraging task correlations.
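One hedged way to picture that balance: weight a ZPD-style learnability score by each task's correlation with the target distribution. The multiplicative combination and the correlation estimate below are illustrative assumptions, not the paper's exact objective.

    import numpy as np

    def procurl_target_pick(pos_cur, corr_to_target):
        """Pick a task balancing learnability against relevance to the target.

        pos_cur:        learner's current success probability per task.
        corr_to_target: per-task correlation with the target task distribution
                        (e.g., expected transfer of progress to target tasks).
        """
        pos_cur = np.asarray(pos_cur, dtype=float)
        corr = np.asarray(corr_to_target, dtype=float)
        learnability = pos_cur * (1.0 - pos_cur)   # peaks at PoS = 0.5
        return int(np.argmax(learnability * corr))

    # Task 1 is slightly harder but far more correlated with the target tasks.
    print(procurl_target_pick(pos_cur=[0.5, 0.4], corr_to_target=[0.1, 0.9]))  # -> 1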
arXiv Detail & Related papers (2024-05-03T21:07:54Z)
- RL-GPT: Integrating Reinforcement Learning and Code-as-policy [82.1804241891039]
We introduce a two-level hierarchical framework, RL-GPT, comprising a slow agent and a fast agent.
The slow agent analyzes actions suitable for coding, while the fast agent executes coding tasks.
This decomposition effectively focuses each agent on specific tasks, proving highly efficient within our pipeline.
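The two-level decomposition reads naturally as a loop in which a slow planner decides which sub-behaviors to implement as code and which to leave to a learned policy. The stub below is only a shape-of-the-pipeline sketch with made-up function names, not the RL-GPT system.

    def slow_agent_plan(task):
        """Stand-in for the slow agent: split a task into sub-actions and
        mark which ones are better hand-coded than learned."""
        return [("navigate_to_tree", "code"), ("chop_wood", "learn")]

    def fast_agent_write_code(action):
        """Stand-in for the fast agent: emit an executable policy snippet."""
        return f"def {action}(env): ..."

    def rl_gpt_pipeline(task):
        coded, learned = [], []
        for action, mode in slow_agent_plan(task):
            if mode == "code":
                coded.append(fast_agent_write_code(action))
            else:
                learned.append(action)   # left to the RL training loop
        return coded, learned

    print(rl_gpt_pipeline("harvest wood"))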
arXiv Detail & Related papers (2024-02-29T16:07:22Z)
- Provable Reward-Agnostic Preference-Based Reinforcement Learning [61.39541986848391]
Preference-based Reinforcement Learning (PbRL) is a paradigm in which an RL agent learns to optimize a task using pair-wise preference-based feedback over trajectories.
We propose a theoretical reward-agnostic PbRL framework that acquires exploratory trajectories enabling accurate learning of the hidden reward function.
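Pairwise preference feedback is usually turned into a reward model with a Bradley-Terry likelihood; the sketch below fits a linear reward on trajectory features by logistic regression on preference labels. This is the standard PbRL building block, not this paper's reward-agnostic exploration scheme.

    import numpy as np

    def fit_reward_from_prefs(feat_a, feat_b, prefs, lr=0.1, steps=2000):
        """Fit a linear reward r(tau) = w . phi(tau) from pairwise preferences.

        feat_a, feat_b: (n, d) cumulative feature vectors of trajectory pairs.
        prefs:          (n,) labels, 1 if trajectory A was preferred, else 0.
        Uses the Bradley-Terry model P(A > B) = sigmoid(w . (phi_A - phi_B)).
        """
        diff = np.asarray(feat_a) - np.asarray(feat_b)
        y = np.asarray(prefs, dtype=float)
        w = np.zeros(diff.shape[1])
        for _ in range(steps):
            p = 1.0 / (1.0 + np.exp(-diff @ w))
            w += lr * diff.T @ (y - p) / len(y)   # gradient ascent on log-likelihood
        return w

    # Toy check: preferences generated by a hidden reward w* = [1, -1].
    rng = np.random.default_rng(0)
    fa, fb = rng.normal(size=(100, 2)), rng.normal(size=(100, 2))
    labels = ((fa - fb) @ np.array([1.0, -1.0]) > 0).astype(float)
    print(fit_reward_from_prefs(fa, fb, labels).round(2))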
arXiv Detail & Related papers (2023-05-29T15:00:09Z)
- Learning to Optimize for Reinforcement Learning [58.01132862590378]
Reinforcement learning (RL) is essentially different from supervised learning, and in practice learned optimizers designed for supervised learning do not work well even on simple RL tasks.
The distribution of agent gradients is non-independent and identically distributed, which leads to inefficient meta-training.
We show that, although trained only on toy tasks, our learned optimizer can generalize to unseen complex tasks in Brax.
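A learned optimizer replaces the hand-designed update rule with a small parameterized function of the gradient (and possibly running statistics) that is itself meta-trained. The sketch below shows only the inner-loop interface, with a hand-initialized two-parameter update rule standing in for a meta-trained network.

    import numpy as np

    class LearnedOptimizer:
        """Tiny stand-in for a learned optimizer: the update is a learned
        function of the gradient and a momentum-like running average.
        In practice theta would come from meta-training, not be hand-set."""

        def __init__(self, theta=(0.05, 0.9)):
            self.lr, self.beta = theta       # "learned" update-rule parameters
            self.avg = None

        def update(self, params, grad):
            if self.avg is None:
                self.avg = np.zeros_like(grad)
            self.avg = self.beta * self.avg + (1 - self.beta) * grad
            return params - self.lr * self.avg

    # Inner loop: minimize f(x) = ||x||^2 with the (stand-in) learned rule.
    opt, x = LearnedOptimizer(), np.array([3.0, -2.0])
    for _ in range(200):
        x = opt.update(x, 2 * x)   # gradient of ||x||^2
    print(x.round(3))              # close to [0, 0]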
arXiv Detail & Related papers (2023-02-03T00:11:02Z)
- Meta Reinforcement Learning with Successor Feature Based Context [51.35452583759734]
We propose a novel meta-RL approach that achieves performance competitive with existing meta-RL algorithms.
Our method not only learns high-quality policies for multiple tasks simultaneously but also adapts quickly to new tasks with a small amount of training.
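Successor features make this kind of fast adaptation concrete: under the standard assumptions Q(s, a) = psi(s, a) . w and r = phi(s, a) . w, a new task only requires re-fitting the weight vector w from a few transitions. A minimal numpy sketch of that building block, not the paper's full context-encoding method:

    import numpy as np

    def adapt_to_new_task(phi, rewards, psi):
        """Adapt to a new task via successor features.

        phi:     (n, d) one-step reward features of observed transitions.
        rewards: (n,)   rewards observed on the new task.
        psi:     (A, d) precomputed successor features psi(s, a) for one state.
        Fits r = phi . w by least squares, then Q(s, a) = psi(s, a) . w.
        """
        w, *_ = np.linalg.lstsq(phi, rewards, rcond=None)
        q_values = psi @ w
        return w, int(np.argmax(q_values))   # greedy action for this state

    rng = np.random.default_rng(1)
    phi = rng.normal(size=(50, 3))
    w_true = np.array([1.0, 0.0, -1.0])
    w, act = adapt_to_new_task(phi, phi @ w_true, psi=rng.normal(size=(4, 3)))
    print(w.round(2), act)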
arXiv Detail & Related papers (2022-07-29T14:52:47Z)
- Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
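The two policies split each episode: a guide policy rolls in for the first h steps, the learning policy finishes the episode, and h shrinks as the learner improves. A schematic loop under those assumptions; env, the policies, and the learner interface are placeholders, and the threshold rule is illustrative.

    def jsrl_episode(env, guide_policy, explore_policy, horizon, h_guide):
        """Run one JSRL-style episode: the guide rolls in for h_guide steps,
        then the exploration (learning) policy takes over."""
        obs, total = env.reset(), 0.0
        for t in range(horizon):
            policy = guide_policy if t < h_guide else explore_policy
            obs, reward, done = env.step(policy(obs))
            total += reward
            if done:
                break
        return total

    def jsrl_train(env, guide, learner, horizon=100, episodes=500):
        h = horizon  # start fully guided, hand the episode over gradually
        for _ in range(episodes):
            ret = jsrl_episode(env, guide, learner.act, horizon, h)
            learner.improve()                 # any off-policy RL update
            if ret >= learner.target_return:  # illustrative curriculum rule
                h = max(0, h - 5)             # shrink the guide's roll-in
        return learner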
arXiv Detail & Related papers (2022-04-05T17:25:22Z)
- Multi-fidelity reinforcement learning framework for shape optimization [0.8258451067861933]
We introduce a controlled transfer learning framework that leverages a multi-fidelity simulation setting.
Our strategy is deployed for an airfoil shape optimization problem at high Reynolds numbers.
Our results demonstrate this framework's applicability to other scientific DRL scenarios.
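Transfer across fidelities typically means spending most samples in the cheap low-fidelity simulator and only fine-tuning in the expensive one. A schematic sketch of that budget split; the train hook and the 20:1 split are placeholders, not the paper's framework.

    def multi_fidelity_train(agent, low_fi_env, high_fi_env,
                             low_fi_steps=100_000, high_fi_steps=5_000):
        """Pretrain where simulation is cheap, fine-tune where it is accurate.

        agent is assumed to expose train(env, steps); the budget split is
        illustrative, not taken from the paper.
        """
        agent.train(low_fi_env, steps=low_fi_steps)    # cheap, coarse physics
        agent.train(high_fi_env, steps=high_fi_steps)  # expensive, accurate
        return agent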
arXiv Detail & Related papers (2022-02-22T20:44:04Z)
- Towards Deployment-Efficient Reinforcement Learning: Lower Bound and Optimality [141.89413461337324]
Deployment efficiency is an important criterion for many real-world applications of reinforcement learning (RL).
We propose a theoretical formulation for deployment-efficient RL (DE-RL) from an "optimization with constraints" perspective.
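Deployment efficiency counts policy switches rather than samples: each deployment collects a large batch under a fixed policy, and all learning happens offline between deployments. A schematic loop under that constraint, with placeholder collect/update hooks.

    def deployment_efficient_rl(env, policy, update, n_deployments=10,
                                episodes_per_deployment=1_000):
        """Learn under a budget of policy deployments (the DE-RL setting).

        The policy is frozen during data collection; `update` consumes the
        batch offline and returns the next policy to deploy. Both hooks are
        placeholders for this sketch.
        """
        for _ in range(n_deployments):
            batch = [env.rollout(policy) for _ in range(episodes_per_deployment)]
            policy = update(policy, batch)  # the only point where the policy changes
        return policy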
arXiv Detail & Related papers (2022-02-14T01:31:46Z)
- Hyperparameter Tuning for Deep Reinforcement Learning Applications [0.3553493344868413]
We propose a distributed variable-length genetic algorithm framework to tune hyperparameters for various RL applications.
Our results show that, with more generations, the algorithm finds optimal solutions that require fewer training episodes, are computationally cheaper, and are more robust for deployment.
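A compact sketch of a variable-length genetic algorithm over hyperparameter configurations: individuals are dicts that can gain or lose keys, and selection keeps the configurations with the best score. The search space, mutation rule, and mocked fitness function are illustrative; in practice fitness would train an RL agent.

    import random

    SPACE = {"lr": [1e-4, 3e-4, 1e-3], "gamma": [0.95, 0.99],
             "hidden": [64, 128, 256], "entropy": [0.0, 0.01]}

    def mutate(cfg):
        cfg = dict(cfg)
        key = random.choice(list(SPACE))
        if key in cfg and random.random() < 0.2:
            del cfg[key]                       # variable length: drop a gene
        else:
            cfg[key] = random.choice(SPACE[key])
        return cfg

    def fitness(cfg):
        # Placeholder: in practice this trains an RL agent and returns its score.
        return -abs(cfg.get("lr", 3e-4) - 3e-4) - abs(cfg.get("gamma", 0.99) - 0.99)

    def evolve(pop_size=20, generations=30):
        population = [mutate({}) for _ in range(pop_size)]
        for _ in range(generations):
            population.sort(key=fitness, reverse=True)
            parents = population[: pop_size // 2]
            population = parents + [mutate(random.choice(parents)) for _ in parents]
        return max(population, key=fitness)

    print(evolve())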
arXiv Detail & Related papers (2022-01-26T20:43:13Z)