Related papers: Expert-Free Online Transfer Learning in Multi-Agent Reinforcement Learning

URL: http://arxiv.org/abs/2501.15495v1
Date: Sun, 26 Jan 2025 11:53:18 GMT
Title: Expert-Free Online Transfer Learning in Multi-Agent Reinforcement Learning
Authors: Alberto Castagna
Abstract summary: Transfer Learning (TL) aims to reduce the learning complexity for an agent dealing with an unfamiliar task. It enables the use of external knowledge from other tasks or agents to enhance a learning process. This is achieved by lowering the amount of new information required by its learning model, resulting in a reduced overall convergence time.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Reinforcement Learning (RL) enables an intelligent agent to optimise its performance in a task by continuously taking actions from an observed state and receiving feedback from the environment in the form of rewards. RL typically uses tables or linear approximators to map the state-action tuples that maximise the reward. Combining RL with deep neural networks (DRL) significantly increases its scalability and enables it to address more complex problems than before. However, DRL also inherits downsides from both RL and deep learning. Although DRL improves generalisation across similar state-action pairs when compared to simpler RL policy representations like tabular methods, it still requires the agent to adequately explore the state-action space. Additionally, deep methods require more training data, with the volume of data escalating with the complexity and size of the neural network. As a result, deep RL requires a long time to collect enough agent-environment samples and to successfully learn the underlying policy. Furthermore, even a slight alteration to the task often invalidates any previously acquired knowledge. To address these shortcomings, Transfer Learning (TL) has been introduced, which enables the use of external knowledge from other tasks or agents to enhance a learning process. The goal of TL is to reduce the learning complexity for an agent dealing with an unfamiliar task by simplifying the exploration process. This is achieved by lowering the amount of new information required by its learning model, resulting in a reduced overall convergence time...

Related papers

Hybrid Inverse Reinforcement Learning [34.793570631021005]
The inverse reinforcement learning approach to imitation learning is a double-edged sword. We propose using hybrid RL -- training on a mixture of online and expert data -- to curtail unnecessary exploration. We derive both model-free and model-based hybrid inverse RL algorithms with strong policy performance guarantees.
arXiv Detail & Related papers (2024-02-13T23:29:09Z)
REBOOT: Reuse Data for Bootstrapping Efficient Real-World Dexterous Manipulation [61.7171775202833]
We introduce an efficient system for learning dexterous manipulation skills with reinforcement learning. The main idea of our approach is the integration of recent advances in sample-efficient RL and replay buffer bootstrapping. Our system completes the real-world training cycle by incorporating learned resets via an imitation-based pickup policy.
arXiv Detail & Related papers (2023-09-06T19:05:31Z)
Learning to Optimize for Reinforcement Learning [58.01132862590378]
Reinforcement learning (RL) is essentially different from supervised learning, and in practice, learned optimizers do not work well even in simple RL tasks. The agent-gradient distribution is non-independent and identically distributed, leading to inefficient meta-training. We show that, although only trained in toy tasks, our learned optimizer can generalize to unseen complex tasks in Brax.
arXiv Detail & Related papers (2023-02-03T00:11:02Z)
A Survey on Explainable Reinforcement Learning: Concepts, Algorithms, Challenges [38.70863329476517]
Reinforcement Learning (RL) is a popular machine learning paradigm where intelligent agents interact with the environment to fulfill a long-term goal. Despite the encouraging results achieved, the deep neural network-based backbone is widely deemed a black box that impedes practitioners from trusting and employing trained agents in realistic scenarios where high security and reliability are essential. To alleviate this issue, a large volume of literature devoted to shedding light on the inner workings of the intelligent agents has been proposed, by constructing intrinsic interpretability or post-hoc explainability.
arXiv Detail & Related papers (2022-11-12T13:52:06Z)
Agent-Controller Representations: Principled Offline RL with Rich Exogenous Information [49.06422815335159]
Learning to control an agent from data collected offline is vital for real-world applications of reinforcement learning (RL). This paper introduces offline RL benchmarks offering the ability to study this problem. We find that contemporary representation learning techniques can fail on datasets where the noise is a complex and time-dependent process.
arXiv Detail & Related papers (2022-10-31T22:12:48Z)
Entropy Regularized Reinforcement Learning with Cascading Networks [9.973226671536041]
Deep RL uses neural networks as function approximators. One of the major difficulties of RL is the absence of i.i.d. data. In this work, we challenge the common practices of the (un)supervised learning community of using a fixed neural architecture.
arXiv Detail & Related papers (2022-10-16T10:28:59Z)
Renaissance Robot: Optimal Transport Policy Fusion for Learning Diverse Skills [28.39150937658635]
We propose a post-hoc technique for policy fusion using Optimal Transport theory. This provides an improved weights initialisation of the neural network policy for learning new tasks. Our results show that specialised knowledge can be unified into a "Renaissance agent", allowing for quicker learning of new skills.
arXiv Detail & Related papers (2022-07-03T08:15:41Z)
Retrieval-Augmented Reinforcement Learning [63.32076191982944]
We train a network to map a dataset of past experiences to optimal behavior. The retrieval process is trained to retrieve information from the dataset that may be useful in the current context. We show that retrieval-augmented R2D2 learns significantly faster than the baseline R2D2 agent and achieves higher scores.
arXiv Detail & Related papers (2022-02-17T02:44:05Z)
Transferred Q-learning [79.79659145328856]
We consider $Q$-learning with knowledge transfer, using samples from a target reinforcement learning (RL) task as well as source samples from different but related RL tasks. We propose transfer learning algorithms for both batch and online $Q$-learning with offline source studies.
arXiv Detail & Related papers (2022-02-09T20:08:19Z)
RvS: What is Essential for Offline RL via Supervised Learning? [77.91045677562802]
Recent work has shown that supervised learning alone, without temporal difference (TD) learning, can be remarkably effective for offline RL. In every environment suite we consider, simply maximizing likelihood with a two-layer feedforward MLP is competitive. The experiments also probe the limits of existing RvS methods, which are comparatively weak on random data.
arXiv Detail & Related papers (2021-12-20T18:55:16Z)
Exploratory State Representation Learning [63.942632088208505]
We propose a new approach called XSRL (eXploratory State Representation Learning) to solve the problems of exploration and SRL in parallel. On one hand, it jointly learns compact state representations and a state transition estimator which is used to remove unexploitable information from the representations. On the other hand, it continuously trains an inverse model, and adds to the prediction error of this model a $k$-step learning progress bonus to form the objective of a discovery policy.
arXiv Detail & Related papers (2021-09-28T10:11:07Z)
Fractional Transfer Learning for Deep Model-Based Reinforcement Learning [0.966840768820136]
Reinforcement learning (RL) is well known for requiring large amounts of data in order for RL agents to learn to perform complex tasks. Recent progress in model-based RL allows agents to be much more data-efficient. We present a simple alternative approach: fractional transfer learning.
arXiv Detail & Related papers (2021-08-14T12:44:42Z)
PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning. We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
arXiv Detail & Related papers (2021-06-09T14:10:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.