A New Representation of Successor Features for Transfer across
Dissimilar Environments
- URL: http://arxiv.org/abs/2107.08426v1
- Date: Sun, 18 Jul 2021 12:37:05 GMT
- Title: A New Representation of Successor Features for Transfer across
Dissimilar Environments
- Authors: Majid Abdolshah, Hung Le, Thommen Karimpanal George, Sunil Gupta,
Santu Rana, Svetha Venkatesh
- Abstract summary: Many real-world RL problems require transfer among environments with different dynamics.
We propose an approach based on successor features in which we model successor feature functions with Gaussian Processes.
Our theoretical analysis proves the convergence of this approach as well as the bounded error on modelling successor feature functions.
- Score: 60.813074750879615
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transfer in reinforcement learning is usually achieved through generalisation
across tasks. Whilst many studies have investigated transferring knowledge when
the reward function changes, they have assumed that the dynamics of the
environments remain consistent. Many real-world RL problems require transfer
among environments with different dynamics. To address this problem, we propose
an approach based on successor features in which we model successor feature
functions with Gaussian Processes permitting the source successor features to
be treated as noisy measurements of the target successor feature function. Our
theoretical analysis proves the convergence of this approach as well as the
bounded error on modelling successor feature functions with Gaussian Processes
in environments with both different dynamics and rewards. We demonstrate our
method on benchmark datasets and show that it outperforms current baselines.
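As a rough illustration of the core idea (not the authors' implementation), the sketch below fits a Gaussian Process to source-environment successor features treated as noisy observations of the target successor feature function. The kernel choice, data shapes, and names such as source_states and source_sf are illustrative assumptions.
```python
# Minimal sketch (assumption-laden, not the paper's code): treat source-task
# successor features as noisy observations of the target successor feature
# function and model that function with a Gaussian Process.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

# Hypothetical source-environment data: 2-D states and one successor feature
# component estimated under the source dynamics (here a synthetic stand-in).
source_states = rng.uniform(-1.0, 1.0, size=(50, 2))
source_sf = np.sin(source_states[:, 0]) + 0.1 * rng.standard_normal(50)

# The WhiteKernel term encodes the assumption that source successor features
# are noisy measurements of the target successor feature function.
kernel = RBF(length_scale=0.5) + WhiteKernel(noise_level=0.1)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(source_states, source_sf)

# Query the GP at target-environment states to obtain a predictive mean and
# an uncertainty estimate for the successor feature there.
target_states = rng.uniform(-1.0, 1.0, size=(5, 2))
sf_mean, sf_std = gp.predict(target_states, return_std=True)
print(sf_mean.round(3), sf_std.round(3))
```
In the paper's setting, each successor feature dimension would presumably be modelled this way, with the GP's predictive variance indicating how far source knowledge can be trusted in the dissimilar target environment.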
Related papers
- Learning Causally Invariant Reward Functions from Diverse Demonstrations [6.351909403078771]
Inverse reinforcement learning methods aim to retrieve the reward function of a Markov decision process based on a dataset of expert demonstrations.
The learned reward function often overfits to the expert dataset, so a policy trained on it degrades under distribution shift of the environment dynamics.
In this work, we explore a novel regularization approach for inverse reinforcement learning methods based on the causal invariance principle with the goal of improved reward function generalization.
arXiv Detail & Related papers (2024-09-12T12:56:24Z)
- Q-value Regularized Transformer for Offline Reinforcement Learning [70.13643741130899]
We propose a Q-value regularized Transformer (QT) to enhance the state-of-the-art in offline reinforcement learning (RL).
QT learns an action-value function and integrates a term that maximizes action-values into the training loss of Conditional Sequence Modeling (CSM).
Empirical evaluations on D4RL benchmark datasets demonstrate the superiority of QT over traditional DP and CSM methods.
arXiv Detail & Related papers (2024-05-27T12:12:39Z)
- Learning Generalizable Agents via Saliency-Guided Features Decorrelation [25.19044461705711]
We propose Saliency-Guided Features Decorrelation (SGFD) to eliminate correlations between features and decisions.
Random Fourier Features (RFF) are used to estimate the complex non-linear correlations in high-dimensional images, while a saliency map is designed to identify the changed features.
Under the guidance of the saliency map, SGFD employs sample reweighting to minimize the estimated correlations related to changed features.
arXiv Detail & Related papers (2023-10-08T09:24:43Z)
- End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
- Investigating the role of model-based learning in exploration and transfer [11.652741003589027]
In this paper, we investigate transfer learning in the context of model-based agents.
We find that a model-based approach outperforms controlled model-free baselines for transfer learning.
Our results show that intrinsic exploration combined with environment models presents a viable direction towards agents that are self-supervised and able to generalize to novel reward functions.
arXiv Detail & Related papers (2023-02-08T11:49:58Z)
- Multitask Adaptation by Retrospective Exploration with Learned World Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides the MBRL agent with training samples drawn from task-agnostic storage.
The model is trained to maximize the expected agent's performance by selecting promising trajectories solving prior tasks from the storage.
arXiv Detail & Related papers (2021-10-25T20:02:57Z)
- Functional Space Analysis of Local GAN Convergence [26.985600125290908]
We study the local dynamics of adversarial training in the general functional space.
We show how it can be represented as a system of partial differential equations.
Our perspective reveals several insights on the practical tricks commonly used to stabilize GANs.
arXiv Detail & Related papers (2021-02-08T18:59:46Z)
- Group Equivariant Deep Reinforcement Learning [4.997686360064921]
We propose the use of Equivariant CNNs to train RL agents and study their inductive bias for transformation equivariant Q-value approximation.
We demonstrate that equivariant architectures can dramatically enhance the performance and sample efficiency of RL agents in a highly symmetric environment.
arXiv Detail & Related papers (2020-07-01T02:38:48Z)
- Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
- Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experiment results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.