Learning to Modulate pre-trained Models in RL
- URL: http://arxiv.org/abs/2306.14884v2
- Date: Fri, 27 Oct 2023 17:28:50 GMT
- Title: Learning to Modulate pre-trained Models in RL
- Authors: Thomas Schmied, Markus Hofmarcher, Fabian Paischer, Razvan Pascanu,
Sepp Hochreiter
- Abstract summary: Fine-tuning a pre-trained model often suffers from catastrophic forgetting.
Our study shows that with most fine-tuning approaches, the performance on pre-training tasks deteriorates significantly.
We propose a novel method, Learning-to-Modulate (L2M), that avoids the degradation of learned skills by modulating the information flow of the frozen pre-trained model.
- Score: 22.812215561012874
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement Learning (RL) has been successful in various domains like
robotics, game playing, and simulation. While RL agents have shown impressive
capabilities in their specific tasks, they adapt poorly to new tasks.
In supervised learning, this adaptation problem is addressed by large-scale
pre-training followed by fine-tuning to new down-stream tasks. Recently,
pre-training on multiple tasks has been gaining traction in RL. However,
fine-tuning a pre-trained model often suffers from catastrophic forgetting.
That is, the performance on the pre-training tasks deteriorates when
fine-tuning on new tasks. To investigate the catastrophic forgetting
phenomenon, we first jointly pre-train a model on datasets from two benchmark
suites, namely Meta-World and DMControl. Then, we evaluate and compare a
variety of fine-tuning methods prevalent in natural language processing, both
in terms of performance on new tasks, and how well performance on pre-training
tasks is retained. Our study shows that with most fine-tuning approaches, the
performance on pre-training tasks deteriorates significantly. Therefore, we
propose a novel method, Learning-to-Modulate (L2M), that avoids the degradation
of learned skills by modulating the information flow of the frozen pre-trained
model via a learnable modulation pool. Our method achieves state-of-the-art
performance on the Continual-World benchmark, while retaining performance on
the pre-training tasks. Finally, to aid future research in this area, we
release a dataset encompassing 50 Meta-World and 16 DMControl tasks.
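For intuition, below is a minimal PyTorch sketch of the modulation idea: a frozen backbone whose hidden activations are rescaled by a vector drawn from a small learnable pool, selected by a task embedding. The names (`ModulationPool`, `ModulatedBackbone`) and the soft-attention lookup are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModulationPool(nn.Module):
    """A pool of learnable modulation vectors; a task embedding soft-attends
    over the pool, and the resulting vector rescales hidden activations."""
    def __init__(self, pool_size: int, hidden_dim: int, key_dim: int):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(pool_size, key_dim))
        # Initialized to ones so that modulation starts as the identity.
        self.values = nn.Parameter(torch.ones(pool_size, hidden_dim))

    def forward(self, task_emb: torch.Tensor) -> torch.Tensor:
        attn = F.softmax(task_emb @ self.keys.T, dim=-1)  # (batch, pool_size)
        return attn @ self.values                         # (batch, hidden_dim)

class ModulatedBackbone(nn.Module):
    """Frozen pre-trained backbone whose outputs are gated by the pool."""
    def __init__(self, backbone: nn.Module, hidden_dim: int, key_dim: int,
                 pool_size: int = 16):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad_(False)  # pre-trained skills stay untouched
        self.pool = ModulationPool(pool_size, hidden_dim, key_dim)

    def forward(self, x: torch.Tensor, task_emb: torch.Tensor) -> torch.Tensor:
        h = self.backbone(x)            # frozen features
        return h * self.pool(task_emb)  # modulate the information flow
```

Only the pool receives gradients during fine-tuning, which is why the skills encoded in the frozen weights cannot degrade.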
Related papers
- Model-Based Transfer Learning for Contextual Reinforcement Learning [5.5597941107270215]
We show how to systematically select good source tasks to train on, maximizing overall performance across a range of target tasks.
The key idea behind our approach is to explicitly model the performance loss incurred by transferring a trained model.
We experimentally validate our methods using urban traffic and standard control benchmarks.
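As a rough illustration of selection driven by a modeled transfer loss, the greedy sketch below picks a budgeted set of source tasks so that every target task is served by its best available source. The matrix `predicted_perf` and the greedy rule are assumptions for illustration, not the paper's actual model.

```python
import numpy as np

def select_training_tasks(predicted_perf: np.ndarray, budget: int) -> list[int]:
    """Greedily choose source tasks to train on.

    predicted_perf[s, t]: modeled performance on target task t when reusing
    a policy trained on source task s (its gap to direct training on t is
    the modeled transfer loss).
    """
    n_sources, n_targets = predicted_perf.shape
    chosen: list[int] = []
    best_so_far = np.full(n_targets, -np.inf)
    for _ in range(budget):
        # Marginal value of each candidate: total performance if its
        # predictions replace worse ones on every target task.
        gains = [
            np.maximum(best_so_far, predicted_perf[s]).sum()
            if s not in chosen else -np.inf
            for s in range(n_sources)
        ]
        s_star = int(np.argmax(gains))
        chosen.append(s_star)
        best_so_far = np.maximum(best_so_far, predicted_perf[s_star])
    return chosen
```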
arXiv Detail & Related papers (2024-08-08T14:46:01Z) - Controlling Forgetting with Test-Time Data in Continual Learning [15.455400390299593]
Ongoing Continual Learning research provides techniques to overcome catastrophic forgetting of previous information when new knowledge is acquired.
We argue that test-time data carry rich information that can be leveraged in a self-supervised manner to refresh the model's memory of previously learned tasks.
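A common self-supervised stand-in for such a refresh is entropy minimization on the unlabeled test batch (in the style of test-time adaptation methods such as TENT); the sketch below uses it purely for illustration and is not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def refresh_on_test_batch(model: torch.nn.Module, x_test: torch.Tensor,
                          optimizer: torch.optim.Optimizer, steps: int = 1):
    """Refresh the model on unlabeled test data by minimizing prediction
    entropy; in practice the optimizer is usually restricted to a small
    parameter subset (e.g., normalization layers)."""
    model.train()
    for _ in range(steps):
        probs = F.softmax(model(x_test), dim=-1)
        entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=-1).mean()
        optimizer.zero_grad()
        entropy.backward()
        optimizer.step()
    model.eval()
```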
arXiv Detail & Related papers (2024-06-19T15:56:21Z) - Adaptive Rentention & Correction for Continual Learning [114.5656325514408]
A common problem in continual learning is the classification layer's bias towards the most recent task.
We name our approach Adaptive Retention & Correction (ARC).
ARC achieves an average performance increase of 2.7% and 2.6% on the CIFAR-100 and ImageNet-R datasets, respectively.
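One simple way to counter a recency-biased classification layer is to rescale the newest task's classifier rows so their norms match those of older classes (in the spirit of weight aligning). The sketch below is an illustrative stand-in, not the actual ARC procedure.

```python
import torch

@torch.no_grad()
def align_recent_task_weights(classifier: torch.nn.Linear,
                              recent_classes: list[int]) -> None:
    """Rescale the rows for the most recent task's classes so their average
    norm matches that of older classes, reducing recency bias in logits."""
    w = classifier.weight
    recent = torch.tensor(recent_classes)
    old_mask = torch.ones(w.size(0), dtype=torch.bool)
    old_mask[recent] = False
    old_norm = w[old_mask].norm(dim=1).mean()
    new_norm = w[recent].norm(dim=1).mean()
    w[recent] *= old_norm / new_norm
```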
arXiv Detail & Related papers (2024-05-23T08:43:09Z) - Task Arithmetic with LoRA for Continual Learning [0.0]
We propose a novel method to continually train vision models using low-rank adaptation and task arithmetic.
When aided by a small memory of 10 samples per class, our method achieves performance close to full-set fine-tuning.
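Task arithmetic with LoRA typically boils down to summing low-rank weight updates; a minimal sketch of such a merge is below (the shapes and uniform coefficients are assumptions, not the paper's configuration).

```python
import torch

def merge_lora_tasks(base_weight: torch.Tensor,
                     loras: list[tuple[torch.Tensor, torch.Tensor]],
                     coeffs: list[float] | None = None) -> torch.Tensor:
    """Sum per-task low-rank updates into a frozen base weight.

    Each adapter is a pair (A, B) with A: (r, in_features) and
    B: (out_features, r); B @ A is that task's "task vector" in weight space.
    """
    coeffs = coeffs if coeffs is not None else [1.0] * len(loras)
    merged = base_weight.clone()
    for (A, B), c in zip(loras, coeffs):
        merged += c * (B @ A)
    return merged
```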
arXiv Detail & Related papers (2023-11-04T15:12:24Z) - Model-Based Reinforcement Learning with Multi-Task Offline Pretraining [59.82457030180094]
We present a model-based RL method that learns to transfer potentially useful dynamics and action demonstrations from offline data to a novel task.
The main idea is to use the world models not only as simulators for behavior learning but also as tools to measure the task relevance.
We demonstrate the advantages of our approach compared with the state-of-the-art methods in Meta-World and DeepMind Control Suite.
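One way to read "world models as tools to measure task relevance" is to score offline source data by how well the target task's world model predicts its transitions; the sketch below, including the `Trajectory` container, is an illustrative assumption rather than the paper's metric.

```python
import torch
from dataclasses import dataclass

@dataclass
class Trajectory:             # minimal illustrative container
    obs: torch.Tensor         # (T, obs_dim)
    actions: torch.Tensor     # (T, act_dim)
    next_obs: torch.Tensor    # (T, obs_dim)

@torch.no_grad()
def task_relevance_scores(world_model, trajectories: list[Trajectory]) -> list[float]:
    """Higher score = lower one-step prediction error under the target
    task's world model, i.e., dynamics more worth transferring."""
    scores = []
    for traj in trajectories:
        pred_next = world_model(traj.obs, traj.actions)
        err = torch.mean((pred_next - traj.next_obs) ** 2)
        scores.append(-err.item())
    return scores
```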
arXiv Detail & Related papers (2023-06-06T02:24:41Z) - Preventing Catastrophic Forgetting in Continual Learning of New Natural
Language Tasks [17.879087904904935]
Multi-Task Learning (MTL) is widely accepted in Natural Language Processing as a standard technique for learning multiple related tasks in one model.
As systems evolve over time, adding a new task to an existing MTL model usually requires retraining the model from scratch on all the tasks.
In this paper, we approach the problem of incrementally expanding MTL models' capability to solve new tasks over time by distilling the knowledge of an already trained model on n tasks into a new one for solving n+1 tasks.
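The distillation step can be illustrated with the standard Hinton-style loss: the n+1-task student matches the n-task teacher's soft outputs on old tasks while fitting labels on the new one. The temperature and weighting below are conventional choices, not the paper's reported settings.

```python
import torch
import torch.nn.functional as F

def expand_and_distill_loss(student_logits_old: torch.Tensor,
                            teacher_logits_old: torch.Tensor,
                            student_logits_new: torch.Tensor,
                            labels_new: torch.Tensor,
                            T: float = 2.0, alpha: float = 0.5) -> torch.Tensor:
    """Loss for growing an n-task model into an n+1-task model:
    a KL term keeps old-task behavior close to the teacher, and a
    cross-entropy term fits the new task's labels."""
    kd = F.kl_div(
        F.log_softmax(student_logits_old / T, dim=-1),
        F.softmax(teacher_logits_old / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    ce = F.cross_entropy(student_logits_new, labels_new)
    return alpha * kd + (1.0 - alpha) * ce
```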
arXiv Detail & Related papers (2023-02-22T00:18:25Z) - SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark
for Semantic and Generative Capabilities [76.97949110580703]
We introduce SUPERB-SG, a new benchmark to evaluate pre-trained models across various speech tasks.
We use a lightweight methodology to test the robustness of representations learned by pre-trained models under shifts in data domain.
We also show that the task diversity of SUPERB-SG coupled with limited task supervision is an effective recipe for evaluating the generalizability of model representation.
arXiv Detail & Related papers (2022-03-14T04:26:40Z) - An Empirical Investigation of the Role of Pre-training in Lifelong
Learning [21.995593026269578]
We show that generic pre-training implicitly alleviates the effects of catastrophic forgetting when learning multiple tasks sequentially.
We study this phenomenon by analyzing the loss landscape, finding that pre-trained weights appear to ease forgetting by leading to wider minima.
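"Wider minima" can be probed empirically by checking how much the loss rises under small random weight perturbations; a flatter minimum rises less. The sketch below is one common way to do this, not the paper's exact analysis.

```python
import copy
import torch

@torch.no_grad()
def flatness_probe(model: torch.nn.Module, loss_fn, data,
                   sigma: float = 0.01, n_samples: int = 10) -> float:
    """Average loss increase under Gaussian weight perturbations of scale
    sigma; smaller values indicate a wider (flatter) minimum."""
    base = loss_fn(model, data).item()
    increases = []
    for _ in range(n_samples):
        probe = copy.deepcopy(model)
        for p in probe.parameters():
            p.add_(torch.randn_like(p) * sigma)
        increases.append(loss_fn(probe, data).item() - base)
    return sum(increases) / n_samples
```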
arXiv Detail & Related papers (2021-12-16T19:00:55Z) - Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
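Structurally, such a prior can sit between the RL policy and the environment: the policy outputs a latent, and a frozen decoder trained on successful trials maps it to a plausible raw action. The wrapper below is a schematic assumption (Parrot itself uses a normalizing flow as the decoder).

```python
import torch
import torch.nn as nn

class PriorWrappedPolicy(nn.Module):
    """RL policy acting through a frozen behavioral prior: exploration
    happens in the prior's latent space, and the decoder reshapes latents
    into actions that are plausible under the pre-training data."""
    def __init__(self, policy: nn.Module, prior_decoder: nn.Module):
        super().__init__()
        self.policy = policy
        self.prior = prior_decoder
        for p in self.prior.parameters():
            p.requires_grad_(False)   # the prior stays fixed during RL

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        z = self.policy(obs)          # latent "intent" chosen by the agent
        return self.prior(z, obs)     # decoded into a raw, plausible action
```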
arXiv Detail & Related papers (2020-11-19T18:47:40Z) - Multi-Stage Influence Function [97.19210942277354]
We develop a multi-stage influence function score to track predictions from a finetuned model all the way back to the pretraining data.
We study two different scenarios with the pretrained embeddings fixed or updated in the finetuning tasks.
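The full multi-stage score involves both training stages and inverse-Hessian terms; the heavily simplified first-order version below only illustrates the core ingredient, a gradient inner product between a test loss and a pretraining example's loss.

```python
import torch

def first_order_influence(model: torch.nn.Module, loss_fn,
                          test_batch, pretrain_example) -> torch.Tensor:
    """First-order stand-in for an influence score: how aligned a
    pretraining example's gradient is with the test-loss gradient."""
    params = [p for p in model.parameters() if p.requires_grad]
    g_test = torch.autograd.grad(loss_fn(model, test_batch), params)
    g_pre = torch.autograd.grad(loss_fn(model, pretrain_example), params)
    return sum((gt * gp).sum() for gt, gp in zip(g_test, g_pre))
```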
arXiv Detail & Related papers (2020-07-17T16:03:11Z) - Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less
Forgetting [66.45372974713189]
We propose a recall and learn mechanism, which adopts the idea of multi-task learning and jointly learns pretraining tasks and downstream tasks.
Experiments show that our method achieves state-of-the-art performance on the GLUE benchmark.
We provide open-source RecAdam, which integrates the proposed mechanisms into Adam, to facilitate adoption by the NLP community.
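The "recall" half can be approximated by a quadratic pull toward the pretrained weights added to the downstream gradient before each optimizer step; RecAdam folds a time-annealed version of this into Adam itself. The sketch below is a rough stand-in with a fixed weighting `gamma`.

```python
import torch

def recall_and_learn_step(model: torch.nn.Module,
                          pretrained_state: dict[str, torch.Tensor],
                          optimizer: torch.optim.Optimizer,
                          loss: torch.Tensor, gamma: float = 1e-3) -> None:
    """One update that 'learns' (downstream loss gradient) while
    'recalling' (quadratic pull toward the pretrained weights)."""
    optimizer.zero_grad()
    loss.backward()
    for name, p in model.named_parameters():
        if p.grad is not None and name in pretrained_state:
            p.grad.add_(gamma * (p.detach() - pretrained_state[name]))
    optimizer.step()
```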
arXiv Detail & Related papers (2020-04-27T08:59:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of the information presented and is not responsible for any consequences arising from its use.