Information-theoretic Task Selection for Meta-Reinforcement Learning
- URL: http://arxiv.org/abs/2011.01054v2
- Date: Thu, 1 Jul 2021 13:45:56 GMT
- Title: Information-theoretic Task Selection for Meta-Reinforcement Learning
- Authors: Ricardo Luna Gutierrez, Matteo Leonetti
- Abstract summary: In Meta-Reinforcement Learning (meta-RL) an agent is trained on a set of tasks to prepare for and learn faster in new, unseen, but related tasks.
We show that given a set of training tasks, learning can be both faster and more effective if the training tasks are appropriately selected.
We propose a task selection algorithm, Information-Theoretic Task Selection (ITTS), based on information theory.
- Score: 9.69596041242667
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In Meta-Reinforcement Learning (meta-RL) an agent is trained on a set of
tasks to prepare for and learn faster in new, unseen, but related tasks. The
training tasks are usually hand-crafted to be representative of the expected
distribution of test tasks and hence all used in training. We show that given a
set of training tasks, learning can be both faster and more effective (leading
to better performance in the test tasks) if the training tasks are
appropriately selected. We propose a task selection algorithm,
Information-Theoretic Task Selection (ITTS), based on information theory, which
optimizes the set of tasks used for training in meta-RL, irrespective of how
they are generated. The algorithm establishes which training tasks are both
sufficiently relevant for the test tasks, and different enough from one
another. We reproduce different meta-RL experiments from the literature and
show that ITTS improves the final performance in all of them.
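The selection rule outlined in the abstract lends itself to a compact illustration. Below is a minimal sketch, assuming policies are represented as functions from states to discrete action distributions and that sampled states are available per task; the helper names, the KL estimator, and the two thresholds are illustrative assumptions, not the paper's exact formulation.
```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL divergence between two discrete action distributions."""
    p = np.asarray(p) + eps
    q = np.asarray(q) + eps
    return float(np.sum(p * np.log(p / q)))

def avg_policy_kl(policy_a, policy_b, states):
    """Average KL between two policies over a set of sampled states."""
    return float(np.mean([kl(policy_a(s), policy_b(s)) for s in states]))

def itts_select(candidates, validation, relevance_eps, difference_eps):
    """Greedy ITTS-style filter (illustrative, not the paper's exact rule).

    candidates, validation: lists of (policy, states) pairs, where a
    policy maps a state to a discrete action distribution.
    """
    selected = []
    for pol, states in candidates:
        # Relevance: the candidate's policy should disagree with the
        # validation-task policies on the validation tasks' own states.
        relevance = np.mean([avg_policy_kl(v_pol, pol, v_states)
                             for v_pol, v_states in validation])
        # Difference: reject candidates too similar to an already-kept task.
        different = all(avg_policy_kl(pol, kept_pol, states) > difference_eps
                        for kept_pol, _ in selected)
        if relevance > relevance_eps and different:
            selected.append((pol, states))
    return selected
```
Here relevance_eps and difference_eps would be treated as hyperparameters; the paper defines the precise relevance and difference criteria in the full text.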
Related papers
- Data-Efficient and Robust Task Selection for Meta-Learning [1.4557421099695473]
We propose the Data-Efficient and Robust Task Selection (DERTS) algorithm, which can be incorporated into both gradient-based and metric-based meta-learning algorithms.
DERTS selects weighted subsets of tasks from the task pool by minimizing the approximation error to the full pool's gradient during meta-training (see the sketch after this entry).
Unlike existing algorithms, DERTS does not require any architecture modification for training and can handle noisy label data in both the support and query sets.
arXiv Detail & Related papers (2024-05-11T19:47:27Z)
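A minimal sketch of the gradient-matching idea summarized above, assuming per-task meta-gradients are available as rows of a matrix; the greedy least-squares scheme and the function name are illustrative stand-ins, not the paper's actual procedure.
```python
import numpy as np

def derts_like_select(task_grads, budget):
    """Greedy weighted subset selection by gradient matching.

    task_grads: (n_tasks, dim) array of per-task meta-gradients.
    Returns (indices, weights) approximating the full pool's mean gradient.
    """
    task_grads = np.asarray(task_grads, dtype=float)
    target = task_grads.mean(axis=0)              # full-pool gradient
    chosen, weights = [], np.array([])
    for _ in range(min(budget, len(task_grads))):
        best_i, best_err, best_w = None, np.inf, None
        for i in range(len(task_grads)):
            if i in chosen:
                continue
            G = task_grads[chosen + [i]].T        # (dim, |subset|)
            # Least-squares task weights for approximating the target.
            w, *_ = np.linalg.lstsq(G, target, rcond=None)
            err = np.linalg.norm(G @ w - target)
            if err < best_err:
                best_i, best_err, best_w = i, err, w
        chosen.append(best_i)
        weights = best_w
    return chosen, weights
```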
- Active Instruction Tuning: Improving Cross-Task Generalization by Training on Prompt Sensitive Tasks [101.40633115037983]
Instruction tuning (IT) achieves impressive zero-shot generalization results by training large language models (LLMs) on a massive amount of diverse tasks with instructions.
How to select new tasks to improve the performance and generalizability of IT models remains an open question.
We propose active instruction tuning based on prompt uncertainty, a framework that identifies informative tasks and then actively tunes the model on the selected tasks (see the sketch after this entry).
arXiv Detail & Related papers (2023-11-01T04:40:05Z)
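One way to read the "prompt uncertainty" signal from the entry above is sketched below: perturb a task's instruction several times and measure how far the model's output log-probability moves, then pick the least robust tasks. model_logprob, the word-dropping perturbation, and the task dictionary format are hypothetical stand-ins, not the paper's interface.
```python
import random

def perturb(instruction, drop_rate=0.1):
    """Randomly drop words from an instruction (one simple perturbation)."""
    words = instruction.split()
    kept = [w for w in words if random.random() > drop_rate]
    return " ".join(kept) if kept else instruction

def prompt_uncertainty(model_logprob, instruction, example, k=8):
    """Average shift in output log-probability under perturbed prompts."""
    base = model_logprob(instruction, example)
    shifts = [abs(base - model_logprob(perturb(instruction), example))
              for _ in range(k)]
    return sum(shifts) / k

def select_tasks(tasks, model_logprob, n):
    """Actively pick the n tasks whose prompts the model is least robust to."""
    return sorted(tasks,
                  key=lambda t: prompt_uncertainty(
                      model_logprob, t["instruction"], t["example"]),
                  reverse=True)[:n]
```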
- Towards Task Sampler Learning for Meta-Learning [37.02030832662183]
Meta-learning aims to learn general knowledge from diverse training tasks constructed from limited data, and then transfer it to new tasks.
It is commonly believed that increasing task diversity will enhance the generalization ability of meta-learning models.
This paper challenges this view through empirical and theoretical analysis.
arXiv Detail & Related papers (2023-07-18T01:53:18Z)
- Reinforcement Learning with Success Induced Task Prioritization [68.8204255655161]
We introduce Success Induced Task Prioritization (SITP), a framework for automatic curriculum learning.
The algorithm selects the order of tasks that provides the fastest learning for agents (see the sketch after this entry).
We demonstrate that SITP matches or surpasses the results of other curriculum design methods.
arXiv Detail & Related papers (2022-12-30T12:32:43Z)
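One plausible reading of success-induced prioritization is sketched below: sample tasks in proportion to the recent change in their success rate, a learning-progress proxy. The class name, window size, and scoring rule are assumptions, not the paper's formula.
```python
import random
from collections import deque

class SuccessPrioritizer:
    """Sample tasks in proportion to recent change in success rate."""

    def __init__(self, task_ids, window=20):
        self.history = {t: deque(maxlen=window) for t in task_ids}

    def record(self, task_id, success):
        self.history[task_id].append(float(success))

    def _progress(self, task_id):
        h = list(self.history[task_id])
        if len(h) < 4:
            return 1.0                      # explore barely-seen tasks first
        half = len(h) // 2
        # Absolute change in success rate between the two window halves.
        return abs(sum(h[half:]) / (len(h) - half) - sum(h[:half]) / half)

    def sample(self):
        tasks = list(self.history)
        scores = [self._progress(t) for t in tasks]
        if sum(scores) == 0:                # no progress anywhere: uniform
            return random.choice(tasks)
        return random.choices(tasks, weights=scores)[0]
```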
- Meta Reinforcement Learning with Successor Feature Based Context [51.35452583759734]
We propose a novel meta-RL approach that achieves competitive performance compared to existing meta-RL algorithms.
Our method not only learns high-quality policies for multiple tasks simultaneously but also adapts quickly to new tasks with a small amount of training.
arXiv Detail & Related papers (2022-07-29T14:52:47Z)
- Learning Action Translator for Meta Reinforcement Learning on Sparse-Reward Tasks [56.63855534940827]
This work introduces a novel objective function to learn an action translator among training tasks.
We theoretically verify that the value of the transferred policy with the action translator can be close to the value of the source policy.
We propose to combine the action translator with context-based meta-RL algorithms for better data collection and more efficient exploration during meta-training (see the sketch after this entry).
arXiv Detail & Related papers (2022-07-19T04:58:06Z)
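Schematically, the objective described above could be written as a transition- and reward-matching loss on translated actions. The step_from simulator interface and the squared-error form are assumptions for illustration, not the paper's objective function.
```python
import numpy as np

def translation_loss(translator, src_env, tgt_env, states, actions):
    """Mean transition- and reward-matching error of translated actions."""
    total = 0.0
    for s, a in zip(states, actions):
        s_next, r = src_env.step_from(s, a)        # outcome in source task
        a_t = translator(s, a)                     # translated action
        s_next_t, r_t = tgt_env.step_from(s, a_t)  # outcome in target task
        total += np.sum((np.asarray(s_next) - np.asarray(s_next_t)) ** 2)
        total += (r - r_t) ** 2
    return total / len(states)
```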
- On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning [71.55412580325743]
We show that multi-task pretraining with fine-tuning on new tasks performs as well as, or better than, meta-pretraining with meta test-time adaptation.
This is encouraging for future research, as multi-task pretraining tends to be simpler and computationally cheaper than meta-RL.
arXiv Detail & Related papers (2022-06-07T13:24:00Z)
- Meta-Reinforcement Learning for Heuristic Planning [12.462608802359936]
In Meta-Reinforcement Learning (meta-RL) an agent is trained on a set of tasks to prepare for and learn faster in new, unseen, but related tasks.
We show that given a set of training tasks, learning can be both faster and more effective if the training tasks are appropriately selected.
We propose a task selection algorithm, Information-Theoretic Task Selection (ITTS), based on information theory.
arXiv Detail & Related papers (2021-07-06T13:25:52Z)
- Adaptive Task Sampling for Meta-Learning [79.61146834134459]
The key idea of meta-learning for few-shot classification is to mimic the few-shot situations faced at test time.
We propose an adaptive task sampling method to improve the generalization performance.
arXiv Detail & Related papers (2020-07-17T03:15:53Z)