Sharing Lifelong Reinforcement Learning Knowledge via Modulating Masks
- URL: http://arxiv.org/abs/2305.10997v1
- Date: Thu, 18 May 2023 14:19:19 GMT
- Title: Sharing Lifelong Reinforcement Learning Knowledge via Modulating Masks
- Authors: Saptarshi Nath, Christos Peridis, Eseoghene Ben-Iwhiwhu, Xinran Liu,
Shirin Dora, Cong Liu, Soheil Kolouri, Andrea Soltoggio
- Abstract summary: Lifelong learning agents aim to learn multiple tasks sequentially over a lifetime.
Modulating masks, a specific type of parameter isolation approach, have recently shown promise in both supervised and reinforcement learning.
We show that the parameter isolation mechanism used by modulating masks is particularly suitable for exchanging knowledge among agents in a distributed system of lifelong learners.
- Score: 14.893594209310875
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Lifelong learning agents aim to learn multiple tasks sequentially over a
lifetime. This involves the ability to exploit previous knowledge when learning
new tasks and to avoid forgetting. Modulating masks, a specific type of
parameter isolation approach, have recently shown promise in both supervised
and reinforcement learning. While lifelong learning algorithms have been
investigated mainly within a single-agent approach, a question remains on how
multiple agents can share lifelong learning knowledge with each other. We show
that the parameter isolation mechanism used by modulating masks is particularly
suitable for exchanging knowledge among agents in a distributed and
decentralized system of lifelong learners. The key idea is that the isolation
of specific task knowledge to specific masks allows agents to transfer only
specific knowledge on-demand, resulting in robust and effective distributed
lifelong learning. We assume fully distributed and asynchronous scenarios with
dynamic agent numbers and connectivity. An on-demand communication protocol
ensures agents query their peers for specific masks to be transferred and
integrated into their policies when facing each task. Experiments indicate that
on-demand mask communication is an effective way to implement distributed
lifelong reinforcement learning and provides a lifelong learning benefit with
respect to distributed RL baselines such as DD-PPO, IMPALA, and PPO+EWC. The
system is particularly robust to connection drops and demonstrates rapid
learning due to knowledge exchange.
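Since the abstract describes the mechanism only at a high level, the following minimal Python sketch illustrates one way the idea could look in code. It assumes a frozen, identically initialised backbone shared by all agents, per-task binary masks obtained by thresholding score matrices, and a simple synchronous peer query; the names (`LifelongAgent`, `request_mask`, `share_mask`, task id "nav-0") are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch (not the authors' implementation): per-task modulating masks
# over a shared backbone, plus on-demand mask exchange between peers.
import numpy as np

class LifelongAgent:
    def __init__(self, obs_dim, act_dim, seed=0):
        rng = np.random.default_rng(seed)
        # Shared backbone weights; kept frozen in this sketch.
        self.backbone = rng.standard_normal((obs_dim, act_dim))
        self.obs_dim, self.act_dim = obs_dim, act_dim
        # Parameter isolation: one real-valued score matrix per task,
        # thresholded into a binary modulating mask at use time.
        self.mask_scores = {}  # task_id -> array of shape (obs_dim, act_dim)

    def _binary_mask(self, task_id, threshold=0.0):
        return (self.mask_scores[task_id] > threshold).astype(float)

    def policy_logits(self, obs, task_id):
        # The task-specific mask gates (modulates) the shared weights.
        masked_w = self.backbone * self._binary_mask(task_id)
        return obs @ masked_w

    def learn_task(self, task_id, seed=1):
        # Placeholder for optimising mask scores with RL (e.g. policy
        # gradient); here they are simply initialised at random.
        rng = np.random.default_rng(seed)
        self.mask_scores[task_id] = rng.standard_normal(
            (self.obs_dim, self.act_dim))

    # --- on-demand knowledge exchange -----------------------------------
    def share_mask(self, task_id):
        # Reply to a peer's query; None means "I have not seen this task".
        return self.mask_scores.get(task_id)

    def request_mask(self, peers, task_id):
        # Query peers for the mask of `task_id`; integrate the first reply.
        # Only the compact mask is transferred, never the shared backbone.
        for peer in peers:
            scores = peer.share_mask(task_id)
            if scores is not None:
                self.mask_scores[task_id] = scores.copy()
                return True
        return False  # no peer knows this task; fall back to learning it


# Usage: agent B faces task "nav-0" and queries agent A before learning it.
agent_a = LifelongAgent(obs_dim=4, act_dim=2, seed=0)
agent_b = LifelongAgent(obs_dim=4, act_dim=2, seed=0)  # same backbone init
agent_a.learn_task("nav-0")
received = agent_b.request_mask(peers=[agent_a], task_id="nav-0")
print("mask received:", received)
print("logits:", agent_b.policy_logits(np.ones(4), "nav-0"))
```

In this reading, only the compact task-specific mask travels over the network while the shared backbone never needs to be transmitted, which is what would make on-demand transfer cheap and tolerant of dropped connections.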
Related papers
- Enhancing Multiple Dimensions of Trustworthiness in LLMs via Sparse Activation Control [44.326363467045496]
Large Language Models (LLMs) have become a critical area of research in Reinforcement Learning from Human Feedback (RLHF).
Representation engineering offers a new, training-free approach.
This technique leverages semantic features to control the representation of the LLM's intermediate hidden states.
It is difficult to encode various semantic contents, like honesty and safety, into a singular semantic feature.
arXiv Detail & Related papers (2024-11-04T08:36:03Z) - Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models [79.28821338925947]
Domain-Class Incremental Learning is a realistic but challenging continual learning scenario.
To handle these diverse tasks, pre-trained Vision-Language Models (VLMs) are introduced for their strong generalizability.
This incurs a new problem: the knowledge encoded in the pre-trained VLMs may be disturbed when adapting to new tasks, compromising their inherent zero-shot ability.
Existing methods tackle it by tuning VLMs with knowledge distillation on extra datasets, which demands heavy overhead.
We propose the Distribution-aware Interference-free Knowledge Integration (DIKI) framework, retaining the pre-trained knowledge of VLMs.
arXiv Detail & Related papers (2024-07-07T12:19:37Z) - Variational Offline Multi-agent Skill Discovery [43.869625428099425]
We propose two novel auto-encoder schemes to simultaneously capture subgroup- and temporal-level abstractions and form multi-agent skills.
Our method can be applied to offline multi-task data, and the discovered subgroup skills can be transferred across relevant tasks without retraining.
arXiv Detail & Related papers (2024-05-26T00:24:46Z) - PEMT: Multi-Task Correlation Guided Mixture-of-Experts Enables Parameter-Efficient Transfer Learning [28.353530290015794]
We propose PEMT, a novel parameter-efficient fine-tuning framework based on multi-task transfer learning.
We conduct experiments on a broad range of tasks over 17 datasets.
arXiv Detail & Related papers (2024-02-23T03:59:18Z) - Self-Supervised Neuron Segmentation with Multi-Agent Reinforcement
Learning [53.00683059396803]
Masked image modeling (MIM) has been widely used due to its simplicity and effectiveness in recovering original information from masked images.
We propose a decision-based MIM that utilizes reinforcement learning (RL) to automatically search for optimal image masking ratio and masking strategy.
Our approach has a significant advantage over alternative self-supervised methods on the task of neuron segmentation.
arXiv Detail & Related papers (2023-10-06T10:40:46Z) - Masked Autoencoders are Efficient Continual Federated Learners [20.856520787551453]
Continual learning should be grounded in unsupervised learning of representations that are shared across clients.
Masked autoencoders for distribution estimation are particularly amenable to this setup.
arXiv Detail & Related papers (2023-06-06T09:38:57Z) - Lifelong Reinforcement Learning with Modulating Masks [16.24639836636365]
Lifelong learning aims to create AI systems that continuously and incrementally learn during a lifetime, similar to biological learning.
Attempts so far have faced problems, including catastrophic forgetting, interference among tasks, and the inability to exploit previous knowledge.
We show that lifelong reinforcement learning with modulating masks is a promising approach to lifelong learning, to the composition of knowledge to learn increasingly complex tasks, and to knowledge reuse for efficient and faster learning.
arXiv Detail & Related papers (2022-12-21T15:49:20Z) - Self-Supervised Graph Neural Network for Multi-Source Domain Adaptation [51.21190751266442]
Domain adaptation (DA) tries to tackle scenarios in which the test data does not fully follow the distribution of the training data.
By learning from large-scale unlabeled samples, self-supervised learning has now become a new trend in deep learning.
We propose a novel Self-Supervised Graph Neural Network (SSG) to enable more effective inter-task information exchange and knowledge sharing.
arXiv Detail & Related papers (2022-04-08T03:37:56Z) - Modular Adaptive Policy Selection for Multi-Task Imitation Learning
through Task Division [60.232542918414985]
Multi-task learning often suffers from negative transfer, sharing information that should be task-specific.
The proposed approach uses proto-policies as modules to divide the tasks into simple sub-behaviours that can be shared.
We also demonstrate its ability to autonomously divide the tasks into both shared and task-specific sub-behaviours.
arXiv Detail & Related papers (2022-03-28T15:53:17Z) - Fast and Slow Learning of Recurrent Independent Mechanisms [80.38910637873066]
We propose a training framework in which the pieces of knowledge an agent needs and its reward function are stationary and can be re-used across tasks.
An attention mechanism dynamically selects which modules can be adapted to the current task.
We find that meta-learning the modular aspects of the proposed system greatly helps in achieving faster adaptation in a reinforcement learning setup.
arXiv Detail & Related papers (2021-05-18T17:50:32Z) - Transfer Heterogeneous Knowledge Among Peer-to-Peer Teammates: A Model
Distillation Approach [55.83558520598304]
We propose a brand new solution to reuse experiences and transfer value functions among multiple students via model distillation.
We also describe how to design an efficient communication protocol to exploit heterogeneous knowledge.
Our proposed framework, namely Learning and Teaching Categorical Reinforcement, shows promising performance on stabilizing and accelerating learning progress.
arXiv Detail & Related papers (2020-02-06T11:31:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.