Probabilistic Curriculum Learning for Goal-Based Reinforcement Learning
- URL: http://arxiv.org/abs/2504.01459v1
- Date: Wed, 02 Apr 2025 08:15:16 GMT
- Title: Probabilistic Curriculum Learning for Goal-Based Reinforcement Learning
- Authors: Llewyn Salt, Marcus Gallagher
- Abstract summary: Reinforcement learning -- algorithms that teach artificial agents to interact with environments by maximising reward signals -- has achieved significant success in recent years. One promising research direction involves introducing goals to allow multimodal policies, commonly through hierarchical or curriculum reinforcement learning. We present a novel probabilistic curriculum learning algorithm to suggest goals for reinforcement learning agents in continuous control and navigation tasks.
- Score: 2.5352713493505785
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning (RL) -- algorithms that teach artificial agents to interact with environments by maximising reward signals -- has achieved significant success in recent years. These successes have been facilitated by advances in algorithms (e.g., deep Q-learning, deep deterministic policy gradients, proximal policy optimisation, trust region policy optimisation, and soft actor-critic) and specialised computational resources such as GPUs and TPUs. One promising research direction involves introducing goals to allow multimodal policies, commonly through hierarchical or curriculum reinforcement learning. These methods systematically decompose complex behaviours into simpler sub-tasks, analogous to how humans progressively learn skills (e.g. we learn to run before we walk, or we learn arithmetic before calculus). However, fully automating goal creation remains an open challenge. We present a novel probabilistic curriculum learning algorithm to suggest goals for reinforcement learning agents in continuous control and navigation tasks.
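The abstract does not detail the goal-proposal mechanism, so the following is only a rough sketch of the general idea behind probabilistic curriculum goal suggestion: fit a simple probabilistic model over goals the agent has recently achieved and sample new goals from a slightly widened version of that model, so proposals land near the frontier of the agent's current competence. The class name, the widening factor, and the success test below are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

class ProbabilisticGoalCurriculum:
    """Illustrative goal proposer (not the paper's algorithm): fit a Gaussian over
    recently achieved goals and sample new goals from a slightly widened version
    of it, so proposals land near the frontier of the agent's competence."""

    def __init__(self, goal_dim, buffer_size=500, widen=1.5, seed=0):
        self.goal_dim = goal_dim
        self.buffer_size = buffer_size
        self.widen = widen                      # covariance inflation factor
        self.achieved = []                      # goals the agent actually reached
        self.rng = np.random.default_rng(seed)

    def record(self, goal, success):
        """Store goals the agent managed to reach."""
        if success:
            self.achieved.append(np.asarray(goal, dtype=float))
            self.achieved = self.achieved[-self.buffer_size:]

    def propose(self, low, high):
        """Suggest the next training goal; fall back to uniform sampling early on."""
        if len(self.achieved) < 10:
            return self.rng.uniform(low, high, size=self.goal_dim)
        data = np.stack(self.achieved)
        mean = data.mean(axis=0)
        cov = np.cov(data, rowvar=False) + 1e-6 * np.eye(self.goal_dim)
        # Widening the covariance biases proposals toward goals just beyond
        # the region the agent already reaches reliably.
        return np.clip(self.rng.multivariate_normal(mean, self.widen * cov), low, high)

# Toy usage on 2-D navigation goals in [0, 10]^2; "reaching" is faked by a distance test.
curriculum = ProbabilisticGoalCurriculum(goal_dim=2)
for _ in range(50):
    g = curriculum.propose(0.0, 10.0)
    curriculum.record(g, success=np.linalg.norm(g) < 6.0)
print("next suggested goal:", curriculum.propose(0.0, 10.0))
```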
Related papers
- Efficient Exploration in Deep Reinforcement Learning: A Novel Bayesian Actor-Critic Algorithm [0.195804735329484]
Reinforcement learning (RL) and Deep Reinforcement Learning (DRL) have the potential to disrupt, and are already changing, the way we interact with the world.
One of the key indicators of their applicability is their ability to scale and work in real-world scenarios.
arXiv Detail & Related papers (2024-08-19T14:50:48Z)
- Discovering Temporally-Aware Reinforcement Learning Algorithms [42.016150906831776]
We propose a simple augmentation to two existing objective discovery approaches.
We find that commonly used meta-gradient approaches fail to discover adaptive objective functions.
arXiv Detail & Related papers (2024-02-08T17:07:42Z)
- Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta-algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
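JSRL's two policies are a pre-existing guide policy that rolls in for the first part of each episode and a learning exploration policy that finishes it, with the guide's share of the episode shrunk as the learner improves. The sketch below assumes a simplified environment interface (reset/step returning a 3-tuple) and an external train_fn for the off-policy update; both are placeholders, not the paper's implementation.

```python
def jsrl_episode(env, guide_policy, explore_policy, guide_steps, max_steps=200):
    """Roll in with the guide policy for `guide_steps` steps, then hand control
    to the learning policy for the rest of the episode (JSRL's core structure)."""
    obs = env.reset()
    transitions, total_reward = [], 0.0
    for t in range(max_steps):
        policy = guide_policy if t < guide_steps else explore_policy
        action = policy(obs)
        next_obs, reward, done = env.step(action)
        transitions.append((obs, action, reward, next_obs, done))
        total_reward += reward
        obs = next_obs
        if done:
            break
    return transitions, total_reward

def jsrl_curriculum(env, guide_policy, explore_policy, train_fn,
                    horizon=200, stages=10, episodes_per_stage=20,
                    advance_threshold=0.0):
    """Shrink the guide's roll-in horizon stage by stage, advancing only once
    the combined agent's average return clears `advance_threshold`."""
    stage = stages
    for _ in range(stages * 10):                      # overall training budget
        guide_steps = int(horizon * stage / stages)
        returns = []
        for _ in range(episodes_per_stage):
            transitions, ep_return = jsrl_episode(
                env, guide_policy, explore_policy, guide_steps, horizon)
            train_fn(transitions)                     # off-policy update of explore_policy
            returns.append(ep_return)
        if sum(returns) / len(returns) >= advance_threshold and stage > 0:
            stage -= 1                                # learner copes: shorten the roll-in
    return explore_policy
```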
arXiv Detail & Related papers (2022-04-05T17:25:22Z)
- C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks [133.40619754674066]
Goal-conditioned reinforcement learning can solve tasks in a wide range of domains, including navigation and manipulation.
We propose to solve the distant goal-reaching task by using search at training time to automatically generate intermediate states.
The E-step corresponds to planning an optimal sequence of waypoints using graph search, while the M-step learns a goal-conditioned policy to reach those waypoints.
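A minimal reading of the E-step is a shortest-path search over a graph built from previously visited states; the M-step then trains the goal-conditioned policy on the resulting waypoints. The k-nearest-neighbour graph with Euclidean edge costs below is a simplifying assumption for illustration (the paper's actual distance estimates may differ), and the M-step is only indicated in a comment.

```python
import numpy as np

def plan_waypoints(states, start, goal, k=10):
    """E-step (sketch): build a k-nearest-neighbour graph over visited states and
    run Dijkstra from start to goal, returning the intermediate states as waypoints."""
    nodes = np.vstack([start, states, goal])
    n = len(nodes)
    dists = np.linalg.norm(nodes[:, None, :] - nodes[None, :, :], axis=-1)
    adj = np.full((n, n), np.inf)
    for i in range(n):
        nbrs = np.argsort(dists[i])[1:k + 1]     # skip self, keep the k nearest
        adj[i, nbrs] = dists[i, nbrs]
    cost = np.full(n, np.inf)
    cost[0] = 0.0
    prev = np.full(n, -1)
    visited = np.zeros(n, dtype=bool)
    for _ in range(n):
        masked = np.where(visited, np.inf, cost)
        u = int(np.argmin(masked))
        if masked[u] == np.inf or u == n - 1:    # unreachable, or goal reached
            break
        visited[u] = True
        relaxed = cost[u] + adj[u]
        better = relaxed < cost
        prev[better] = u
        cost = np.minimum(cost, relaxed)
    path, node = [], n - 1
    while node != -1:
        path.append(node)
        node = prev[node]
    return [nodes[i] for i in reversed(path)][1:-1]   # intermediate waypoints only

# M-step (sketch): relabel stored transitions with the next waypoint as the goal
# and update the goal-conditioned policy on these shorter-horizon subproblems.

# Toy usage: random 2-D visited states, planning from the origin to a distant goal.
rng = np.random.default_rng(0)
visited_states = rng.uniform(0, 10, size=(200, 2))
print(plan_waypoints(visited_states, start=np.zeros(2), goal=np.array([9.0, 9.0]))[:3])
```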
arXiv Detail & Related papers (2021-10-22T22:05:31Z)
- The Information Geometry of Unsupervised Reinforcement Learning [133.20816939521941]
Unsupervised skill discovery is a class of algorithms that learn a set of policies without access to a reward function.
We show that unsupervised skill discovery algorithms do not learn skills that are optimal for every possible reward function.
arXiv Detail & Related papers (2021-10-06T13:08:36Z)
- MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning [65.52675802289775]
We show that an uncertainty-aware classifier can solve challenging reinforcement learning problems.
We propose a novel method for computing the normalized maximum likelihood (NML) distribution.
We show that the resulting algorithm has a number of intriguing connections to both count-based exploration methods and prior algorithms for learning reward functions.
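The normalized maximum likelihood idea can be spelled out brute-force for a binary success classifier: refit the classifier with the query state appended under each candidate label, score the query under each refit model, and normalise the two scores into a reward. MURAL amortises this step with meta-learning and an expressive network (which is what yields its uncertainty behaviour far from the data); the sketch below, using a plain scikit-learn logistic regression as the refit model, only illustrates the structure of the computation, not the paper's method.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def cnml_reward(X, y, query):
    """Brute-force conditional NML for a binary success classifier: refit with the
    query appended under each candidate label, score the query under each refit
    model, and normalise. (MURAL amortises this step with meta-learning.)"""
    scores = []
    for label in (0, 1):
        X_aug = np.vstack([X, query])
        y_aug = np.append(y, label)
        model = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)
        # Probability the refit model assigns to the label it was trained with.
        scores.append(model.predict_proba(query.reshape(1, -1))[0, label])
    return scores[1] / (scores[0] + scores[1])

# Toy example: "success" states cluster near (5, 5); the returned value plays the
# role of the reward for reaching the queried state.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(50, 2)), rng.normal(5, 1, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)
print(cnml_reward(X, y, np.array([5.0, 5.0])))
print(cnml_reward(X, y, np.array([0.0, 8.0])))
```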
arXiv Detail & Related papers (2021-07-15T08:19:57Z)
- GRAC: Self-Guided and Self-Regularized Actor-Critic [24.268453994605512]
We propose a self-regularized TD-learning method to address divergence without requiring a target network.
We also propose a self-guided policy improvement method by combining policy-gradient with zero-order optimization.
This makes learning more robust to local noise in the Q function approximation and guides the updates of our actor network.
We evaluate GRAC on the suite of OpenAI Gym tasks, matching or outperforming the state of the art in every environment tested.
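The zero-order component can be pictured as a cross-entropy-method (CEM) search around the actor's proposed action, scored by the learned Q-function; the actor is then guided toward whichever action scores higher. The sketch below shows only that search step with a stand-in Q-function; GRAC's TD self-regularisation and its actual actor update are not reproduced here.

```python
import numpy as np

def cem_refine_action(q_fn, state, actor_action, iters=4, pop=64, elite_frac=0.1,
                      init_std=0.3, low=-1.0, high=1.0, rng=None):
    """Zero-order (CEM) search around the actor's proposed action: sample perturbed
    actions, keep the elites under the learned Q-function, refit the sampling
    distribution, and return whichever action Q prefers in the end."""
    rng = rng or np.random.default_rng(0)
    actor_action = np.asarray(actor_action, dtype=float)
    mean, std = actor_action.copy(), np.full_like(actor_action, init_std)
    n_elite = max(1, int(pop * elite_frac))
    for _ in range(iters):
        samples = np.clip(rng.normal(mean, std, size=(pop, mean.size)), low, high)
        scores = np.array([q_fn(state, a) for a in samples])
        elites = samples[np.argsort(scores)[-n_elite:]]
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-3
    # Fall back to the actor's own action if the search did not improve on it.
    return mean if q_fn(state, mean) > q_fn(state, actor_action) else actor_action

# Toy check with a quadratic stand-in for the learned critic.
q_fn = lambda state, action: -float(np.sum((action - 0.4) ** 2))
print(cem_refine_action(q_fn, state=None, actor_action=np.zeros(3)))
```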
arXiv Detail & Related papers (2020-09-18T17:58:29Z)
- Discovering Reinforcement Learning Algorithms [53.72358280495428]
Reinforcement learning algorithms update an agent's parameters according to one of several possible rules.
This paper introduces a new meta-learning approach that discovers an entire update rule.
The discovered rule includes both 'what to predict' (e.g. value functions) and 'how to learn from it', and is found by interacting with a set of environments.
arXiv Detail & Related papers (2020-07-17T07:38:39Z)
- Meta-Gradient Reinforcement Learning with an Objective Discovered Online [54.15180335046361]
We propose an algorithm based on meta-gradient descent that discovers its own objective, flexibly parameterised by a deep neural network.
Because the objective is discovered online, it can adapt to changes over time.
On the Arcade Learning Environment (Atari), the meta-gradient algorithm adapts over time to learn with greater efficiency.
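A scalar toy analogue (an assumption for illustration, not the paper's RL setup) shows the two-level structure: an inner step updates the agent parameter with a parameterised objective, and an outer step differentiates the true performance through that inner step to adapt the objective's parameters online.

```python
import numpy as np

# Toy analogue of an objective discovered online via meta-gradients. The inner
# objective L_eta(theta) = eta[1] * (theta - eta[0])**2 has a learned target
# eta[0] and a learned weight eta[1]; the outer loop differentiates the true
# performance J(theta') = (theta' - theta_star)**2 through the inner update.

alpha, beta = 0.1, 0.02           # inner and outer (meta) step sizes
theta_star = 2.0                  # what good performance actually requires
theta = 0.0
eta = np.array([1.0, 1.0])        # [discovered target, discovered weight]

for _ in range(500):
    # Inner update of the agent parameter under the discovered objective.
    grad_theta = 2.0 * eta[1] * (theta - eta[0])
    theta_new = theta - alpha * grad_theta

    # Meta-gradient: dJ/deta = dJ/dtheta' * dtheta'/deta (chain rule through the
    # inner update), with dtheta'/deta0 = 2*alpha*eta1 and
    # dtheta'/deta1 = -2*alpha*(theta - eta0).
    dJ_dtheta_new = 2.0 * (theta_new - theta_star)
    dtheta_new_deta = np.array([2.0 * alpha * eta[1],
                                -2.0 * alpha * (theta - eta[0])])
    eta -= beta * dJ_dtheta_new * dtheta_new_deta

    theta = theta_new

print("theta:", round(theta, 3), "discovered eta:", np.round(eta, 3))
```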
arXiv Detail & Related papers (2020-07-16T16:17:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.