CLUTR: Curriculum Learning via Unsupervised Task Representation Learning
- URL: http://arxiv.org/abs/2210.10243v1
- Date: Wed, 19 Oct 2022 01:45:29 GMT
- Title: CLUTR: Curriculum Learning via Unsupervised Task Representation Learning
- Authors: Abdus Salam Azad, Izzeddin Gur, Aleksandra Faust, Pieter Abbeel, and
Ion Stoica
- Abstract summary: CLUTR is a novel curriculum learning algorithm that decouples task representation and curriculum learning into a two-stage optimization.
We show CLUTR outperforms PAIRED, a principled and popular UED method, in terms of generalization and sample efficiency in the challenging CarRacing and navigation environments.
- Score: 130.79246770546413
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement Learning (RL) algorithms are often known for sample
inefficiency and difficult generalization. Recently, Unsupervised Environment
Design (UED) emerged as a new paradigm for zero-shot generalization by
simultaneously learning a task distribution and agent policies on the sampled
tasks. This is a non-stationary process where the task distribution evolves
along with agent policies, creating an instability over time. While past works
demonstrated the potential of such approaches, sampling effectively from the
task space remains an open challenge, bottlenecking these approaches. To this
end, we introduce CLUTR: a novel curriculum learning algorithm that decouples
task representation and curriculum learning into a two-stage optimization. It
first trains a recurrent variational autoencoder on randomly generated tasks to
learn a latent task manifold. Next, a teacher agent creates a curriculum by
maximizing a minimax REGRET-based objective on a set of latent tasks sampled
from this manifold. By keeping the task manifold fixed, we show that CLUTR
successfully overcomes the non-stationarity problem and improves stability. Our
experimental results show CLUTR outperforms PAIRED, a principled and popular
UED method, in terms of generalization and sample efficiency in the challenging
CarRacing and navigation environments: showing an 18x improvement on the F1
CarRacing benchmark. CLUTR also performs comparably to the non-UED
state-of-the-art for CarRacing, outperforming it in nine of the 20 tracks.
CLUTR also achieves a 33% higher solved rate than PAIRED on a set of 18
out-of-distribution navigation tasks.
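To make the two-stage decoupling concrete, here is a minimal sketch over a toy task space (fixed-length token sequences). `TaskVAE`, `random_tasks`, and `estimate_regret` are illustrative stand-ins rather than CLUTR's actual interfaces, and the teacher here simply ranks sampled latent tasks by regret instead of learning a generative policy as in the paper.

```python
# A minimal sketch of CLUTR's two-stage optimization over a toy task space.
import torch
import torch.nn as nn

VOCAB, TASK_LEN, LATENT = 16, 8, 4

class TaskVAE(nn.Module):
    """Recurrent VAE over token sequences that encode tasks."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, 32)
        self.encoder = nn.GRU(32, 64, batch_first=True)
        self.to_mu = nn.Linear(64, LATENT)
        self.to_logvar = nn.Linear(64, LATENT)
        self.decoder = nn.GRU(LATENT, 64, batch_first=True)
        self.head = nn.Linear(64, VOCAB)

    def forward(self, tokens):
        _, h = self.encoder(self.embed(tokens))
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.decode(z), mu, logvar

    def decode(self, z):
        # Feed the latent at every step: a toy stand-in for autoregressive decoding.
        out, _ = self.decoder(z.unsqueeze(1).repeat(1, TASK_LEN, 1))
        return self.head(out)

def random_tasks(n):
    return torch.randint(0, VOCAB, (n, TASK_LEN))

# Stage 1: learn a latent task manifold from randomly generated tasks.
vae = TaskVAE()
opt = torch.optim.Adam(vae.parameters(), lr=1e-3)
for _ in range(200):
    tasks = random_tasks(64)
    logits, mu, logvar = vae(tasks)
    recon = nn.functional.cross_entropy(logits.reshape(-1, VOCAB), tasks.reshape(-1))
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
    loss = recon + 0.1 * kl
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: the manifold stays fixed while a teacher searches latent space.
def estimate_regret(task_tokens):
    # Placeholder for the PAIRED-style regret signal: antagonist return
    # minus protagonist return on each decoded task.
    return torch.rand(task_tokens.shape[0])

for p in vae.parameters():
    p.requires_grad_(False)  # freezing the VAE keeps the task manifold stationary

with torch.no_grad():
    latents = torch.randn(32, LATENT)         # latent tasks from the prior
    decoded = vae.decode(latents).argmax(-1)  # concrete task encodings
curriculum = decoded[estimate_regret(decoded).argsort(descending=True)[:8]]
print("high-regret tasks for the student:", curriculum.shape)
```

The property the sketch preserves is that stage 2 never updates the VAE, so the teacher optimizes regret over a stationary latent manifold.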
Related papers
- Adaptive Retention & Correction for Continual Learning [114.5656325514408]
A common problem in continual learning is the classification layer's bias towards the most recent task.
We name our approach Adaptive Retention & Correction (ARC).
ARC achieves average performance increases of 2.7% and 2.6% on the CIFAR-100 and ImageNet-R datasets, respectively.
arXiv Detail & Related papers (2024-05-23T08:43:09Z)
- Sample Efficient Reinforcement Learning by Automatically Learning to Compose Subtasks [3.1594865504808944]
We propose an RL algorithm that automatically structures the reward function for sample efficiency, given a set of labels that signify subtasks.
We evaluate our algorithm in a variety of sparse-reward environments.
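The summary leaves the reward-structuring mechanism open; one hedged reading is that subtask label functions can be composed into a shaped reward that pays a bonus for each completed subtask. The predicates, bonus, and prefix rule below are hypothetical, not the paper's exact formulation.

```python
# A hedged reconstruction of reward composition from subtask labels: each
# label is a predicate that fires when its subtask is achieved, and the agent
# earns a bonus for the deepest completed prefix of subtasks.
def make_composed_reward(subtask_labels, bonus=0.1):
    """subtask_labels: ordered list of predicates state -> bool."""
    def reward(state, env_reward):
        progress = 0
        for label in subtask_labels:   # count consecutively completed subtasks
            if not label(state):
                break
            progress += 1
        return env_reward + bonus * progress   # dense signal + sparse env reward
    return reward

# Usage with hypothetical predicates for a key-door task:
labels = [lambda s: s["holding_key"], lambda s: s["door_open"]]
r = make_composed_reward(labels)
print(r({"holding_key": True, "door_open": False}, env_reward=0.0))  # 0.1
```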
arXiv Detail & Related papers (2024-01-25T15:06:40Z)
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the order of all the multi-task data for training.
At the task level, we aim to find the optimal task order that minimizes the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task, then divide them into easy-to-difficult mini-batches for training.
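A minimal sketch of both levels, assuming a precomputed pairwise interference matrix and per-instance difficulty scores; both oracles are placeholders here, not the paper's estimators.

```python
import numpy as np

def order_tasks(interference):
    """Greedy task ordering: always append the task that interferes least
    with the previously scheduled one."""
    order, remaining = [0], set(range(1, interference.shape[0]))
    while remaining:
        nxt = min(remaining, key=lambda t: interference[order[-1], t])
        order.append(nxt)
        remaining.remove(nxt)
    return order

def easy_to_difficult_batches(difficulties, batch_size):
    idx = np.argsort(difficulties)  # easiest instances first
    return [idx[i:i + batch_size] for i in range(0, len(idx), batch_size)]

rng = np.random.default_rng(0)
inter = rng.random((4, 4))
inter = (inter + inter.T) / 2       # symmetric placeholder interference
print(order_tasks(inter))
print(easy_to_difficult_batches(rng.random(10), batch_size=4))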
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- On the Benefit of Optimal Transport for Curriculum Reinforcement Learning [32.59609255906321]
We focus on framing curricula as interpolations between task distributions.
We frame the generation of a curriculum as a constrained optimal transport problem.
Benchmarks show that this way of curriculum generation can improve upon existing CRL methods.
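As a rough illustration of the transport view, the sketch below computes a transport plan between a discrete source and target task distribution with plain Sinkhorn iterations; the cost matrix, distributions, and the mixture used for intermediate stages are toy stand-ins for the paper's constrained formulation.

```python
import numpy as np

def sinkhorn(cost, p, q, reg=0.1, iters=200):
    K = np.exp(-cost / reg)
    u = np.ones_like(p)
    for _ in range(iters):
        v = q / (K.T @ u)
        u = p / (K @ v)
    return u[:, None] * K * v[None, :]  # transport plan between p and q

tasks = np.linspace(0.0, 1.0, 5)                 # 1-D task "difficulty" grid
cost = (tasks[:, None] - tasks[None, :]) ** 2
source = np.array([0.7, 0.2, 0.1, 0.0, 0.0])     # mass on easy tasks
target = np.array([0.0, 0.0, 0.1, 0.2, 0.7])     # mass on hard tasks
plan = sinkhorn(cost, source, target)
print("transport plan:\n", np.round(plan, 3))

# Intermediate training distributions along the way to the target
# (a simple mixture stand-in for displacement interpolation).
for alpha in (0.25, 0.5, 0.75):
    print(f"alpha={alpha}:", np.round((1 - alpha) * source + alpha * target, 2))
```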
arXiv Detail & Related papers (2023-09-25T12:31:37Z)
- Visual Exemplar Driven Task-Prompting for Unified Perception in Autonomous Driving [100.3848723827869]
We present an effective multi-task framework, VE-Prompt, which introduces visual exemplars via task-specific prompting.
Specifically, we generate visual exemplars based on bounding boxes and color-based markers, which provide accurate visual appearances of target categories.
We bridge transformer-based encoders and convolutional layers for efficient and accurate unified perception in autonomous driving.
arXiv Detail & Related papers (2023-03-03T08:54:06Z)
- ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning [59.08197876733052]
Auxiliary-Task Learning (ATL) aims to improve the performance of the target task by leveraging the knowledge obtained from related tasks.
Sometimes, learning multiple tasks simultaneously results in lower accuracy than learning only the target task, a phenomenon known as negative transfer.
ForkMerge is a novel approach that periodically forks the model into multiple branches and automatically searches the varying task weights.
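A minimal fork-and-merge cycle might look as follows, assuming a scalar auxiliary-task weight per branch and a validation-score oracle; the losses, hyperparameters, and softmax merge rule are illustrative guesses rather than ForkMerge's exact procedure.

```python
import copy
import torch
import torch.nn as nn

model = nn.Linear(4, 1)
aux_weights = [0.0, 0.5, 1.0]  # candidate weightings of the auxiliary task

def train_branch(net, aux_w, steps=20):
    opt = torch.optim.SGD(net.parameters(), lr=0.1)
    for _ in range(steps):
        x = torch.randn(16, 4)
        target_loss = (net(x) - x.sum(1, keepdim=True)).pow(2).mean()
        aux_loss = net(x).pow(2).mean()            # placeholder auxiliary loss
        loss = target_loss + aux_w * aux_loss
        opt.zero_grad(); loss.backward(); opt.step()
    return net

def val_score(net):  # higher is better on the *target* task only
    x = torch.randn(64, 4)
    return -(net(x) - x.sum(1, keepdim=True)).pow(2).mean().item()

# Fork into branches, train each with its own task weight, then merge the
# parameters weighted by target-task validation performance.
branches = [train_branch(copy.deepcopy(model), w) for w in aux_weights]
mix = torch.softmax(torch.tensor([val_score(b) for b in branches]) / 0.1, dim=0)
with torch.no_grad():
    for name, p in model.named_parameters():
        p.copy_(sum(m * dict(b.named_parameters())[name] for m, b in zip(mix, branches)))
print("merged with branch weights:", mix.tolist())
```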
arXiv Detail & Related papers (2023-01-30T02:27:02Z)
- Reinforcement Learning with Success Induced Task Prioritization [68.8204255655161]
We introduce Success Induced Task Prioritization (SITP), a framework for automatic curriculum learning.
The algorithm selects the order of tasks that provide the fastest learning for agents.
We demonstrate that SITP matches or surpasses the results of other curriculum design methods.
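The summary does not pin down SITP's scoring rule; one hedged reconstruction is to prioritize the task whose recent success rate is changing fastest (a learning-progress signal), as sketched below with a random stand-in for the rollout result.

```python
import random
from collections import defaultdict, deque

histories = defaultdict(lambda: deque(maxlen=20))  # task -> recent outcomes

def record(task, success):
    histories[task].append(1.0 if success else 0.0)

def next_task(tasks):
    def progress(t):
        h = list(histories[t])
        if len(h) < 4:
            return float("inf")        # force exploration of under-sampled tasks
        half = len(h) // 2
        recent, old = h[half:], h[:half]
        return abs(sum(recent) / len(recent) - sum(old) / len(old))
    return max(tasks, key=progress)

tasks = ["reach", "push", "stack"]
for _ in range(100):
    t = next_task(tasks)
    record(t, random.random() < 0.5)   # placeholder for an actual rollout
print({t: sum(histories[t]) / len(histories[t]) for t in tasks})
```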
arXiv Detail & Related papers (2022-12-30T12:32:43Z)
- Curriculum Reinforcement Learning using Optimal Transport via Gradual Domain Adaptation [46.103426976842336]
Curriculum Reinforcement Learning (CRL) aims to create a sequence of tasks, starting from easy ones and gradually progressing towards difficult ones.
In this work, we focus on the idea of framing CRL as interpolations between a source (auxiliary) and a target task distribution.
Inspired by the insights from gradual domain adaptation in semi-supervised learning, we create a natural curriculum by breaking down the potentially large task distributional shift in CRL into smaller shifts.
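A minimal sketch of the gradual-shift idea, using 1-D quantile (Wasserstein) interpolation between sampled task parameters to produce intermediate distributions; the difficulty parameterization and step count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
source = rng.normal(0.0, 0.5, size=1000)   # easy tasks (small difficulty values)
target = rng.normal(3.0, 0.5, size=1000)   # hard tasks

def intermediate(src, tgt, alpha, n_quantiles=100):
    qs = np.quantile(src, np.linspace(0, 1, n_quantiles))
    qt = np.quantile(tgt, np.linspace(0, 1, n_quantiles))
    return (1 - alpha) * qs + alpha * qt   # 1-D displacement interpolation

# Each stage is a small distributional shift the agent adapts to in turn.
for alpha in np.linspace(0, 1, 5):
    stage = intermediate(source, target, alpha)
    print(f"alpha={alpha:.2f}: mean difficulty {stage.mean():.2f}")
```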
arXiv Detail & Related papers (2022-10-18T22:33:33Z)
- Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems [42.973910399533054]
We introduce a curriculum learning algorithm, Variational Automatic Curriculum Learning (VACL), for solving cooperative multi-agent reinforcement learning problems.
Our VACL algorithm implements this variational paradigm with two practical components, task expansion and entity progression.
Experiment results show that VACL solves a collection of sparse-reward problems with a large number of agents.
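Reading the two named components literally, a toy loop might expand the active task set around solved tasks and add agents once success is uniformly high; the thresholds, perturbation, and success oracle below are placeholders, not VACL's variational machinery.

```python
import random

def expand(task):
    # Propose a nearby task by perturbing a continuous parameter (goal spread).
    return {"spread": task["spread"] + random.uniform(-0.1, 0.3),
            "agents": task["agents"]}

def solve_rate(task):
    return random.random()              # placeholder for evaluated success

active = [{"spread": 0.1, "agents": 2}]
for _ in range(50):
    task = random.choice(active)
    if solve_rate(task) > 0.6:          # solved well: push the task frontier
        active.append(expand(task))     # task expansion
        if all(solve_rate(t) > 0.6 for t in active[-5:]):
            # entity progression: same task family, one more agent
            active.append({"spread": task["spread"], "agents": task["agents"] + 1})
print(f"{len(active)} tasks, up to {max(t['agents'] for t in active)} agents")
```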
arXiv Detail & Related papers (2021-11-08T16:35:08Z)
- Trying AGAIN instead of Trying Longer: Prior Learning for Automatic Curriculum Learning [39.489869446313065]
A major challenge in the Deep RL (DRL) community is to train agents able to generalize over unseen situations.
We propose a two-stage ACL approach where 1) a teacher algorithm first learns to train a DRL agent with a high-exploration curriculum, and then 2) distills learned priors from the first run to generate an "expert curriculum".
Besides demonstrating 50% improvements on average over the current state of the art, the objective of this work is to give a first example of a new research direction oriented towards refining ACL techniques over multiple learners.
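A compact sketch of the two stages, assuming an epsilon-greedy bandit teacher and a random stand-in for the learning-progress signal; distilling the log into the most frequent task per phase is an illustrative guess at how the "expert curriculum" could be generated.

```python
import random
from collections import Counter

tasks = ["t0", "t1", "t2", "t3"]

def learning_progress(task):
    return random.random()    # placeholder for a measured progress signal

# Stage 1: high-exploration teacher run, logging every task it serves.
values, history = {t: 0.0 for t in tasks}, []
for _ in range(200):
    t = random.choice(tasks) if random.random() < 0.5 else max(values, key=values.get)
    values[t] += 0.1 * (learning_progress(t) - values[t])
    history.append(t)

# Stage 2: distill priors from the first run into a fixed expert curriculum
# that a fresh learner replays phase by phase.
phases = [history[i:i + 50] for i in range(0, len(history), 50)]
expert_curriculum = [Counter(p).most_common(1)[0][0] for p in phases]
print("replay order for the second run:", expert_curriculum)
```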
arXiv Detail & Related papers (2020-04-07T07:30:27Z)