Towards Generalizable Reinforcement Learning via Causality-Guided Self-Adaptive Representations
- URL: http://arxiv.org/abs/2407.20651v3
- Date: Wed, 2 Oct 2024 06:32:21 GMT
- Title: Towards Generalizable Reinforcement Learning via Causality-Guided Self-Adaptive Representations
- Authors: Yupei Yang, Biwei Huang, Fan Feng, Xinyue Wang, Shikui Tu, Lei Xu,
- Abstract summary: General intelligence requires quick adaption across tasks.
In this paper, we explore a wider range of scenarios where not only the distribution but also the environment spaces may change.
We introduce a causality-guided self-adaptive representation-based approach, called CSR, that equips the agent to generalize effectively.
- Score: 22.6449779859417
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: General intelligence requires quick adaption across tasks. While existing reinforcement learning (RL) methods have made progress in generalization, they typically assume only distribution changes between source and target domains. In this paper, we explore a wider range of scenarios where not only the distribution but also the environment spaces may change. For example, in the CoinRun environment, we train agents from easy levels and generalize them to difficulty levels where there could be new enemies that have never occurred before. To address this challenging setting, we introduce a causality-guided self-adaptive representation-based approach, called CSR, that equips the agent to generalize effectively across tasks with evolving dynamics. Specifically, we employ causal representation learning to characterize the latent causal variables within the RL system. Such compact causal representations uncover the structural relationships among variables, enabling the agent to autonomously determine whether changes in the environment stem from distribution shifts or variations in space, and to precisely locate these changes. We then devise a three-step strategy to fine-tune the causal model under different scenarios accordingly. Empirical experiments show that CSR efficiently adapts to the target domains with only a few samples and outperforms state-of-the-art baselines on a wide range of scenarios, including our simulated environments, CartPole, CoinRun and Atari games.
Related papers
- DRED: Zero-Shot Transfer in Reinforcement Learning via Data-Regularised Environment Design [11.922951794283168]
In this work, we investigate how the sampling of individual environment instances, or levels, affects the zero-shot generalisation (ZSG) ability of RL agents.
We discover that for deep actor-critic architectures sharing their base layers, prioritising levels according to their value loss minimises the mutual information between the agent's internal representation and the set of training levels in the generated training data.
We find that existing UED methods can significantly shift the training distribution, which translates to low ZSG performance.
To prevent both overfitting and distributional shift, we introduce data-regularised environment design (D
arXiv Detail & Related papers (2024-02-05T19:47:45Z) - HAZARD Challenge: Embodied Decision Making in Dynamically Changing
Environments [93.94020724735199]
HAZARD consists of three unexpected disaster scenarios, including fire, flood, and wind.
This benchmark enables us to evaluate autonomous agents' decision-making capabilities across various pipelines.
arXiv Detail & Related papers (2024-01-23T18:59:43Z) - Invariant Causal Imitation Learning for Generalizable Policies [87.51882102248395]
We propose Invariant Causal Learning (ICIL) to learn an imitation policy.
ICIL learns a representation of causal features that is disentangled from the specific representations of noise variables.
We show that ICIL is effective in learning imitation policies capable of generalizing to unseen environments.
arXiv Detail & Related papers (2023-11-02T16:52:36Z) - AACC: Asymmetric Actor-Critic in Contextual Reinforcement Learning [13.167123175701802]
This paper formalizes the task of adapting to changing environmental dynamics in Reinforcement Learning (RL)
We then propose the Asymmetric Actor-Critic in Contextual RL (AACC) as an end-to-end actor-critic method to deal with such generalization tasks.
We demonstrate the essential improvements in the performance of AACC over existing baselines experimentally in a range of simulated environments.
arXiv Detail & Related papers (2022-08-03T22:52:26Z) - From Big to Small: Adaptive Learning to Partial-Set Domains [94.92635970450578]
Domain adaptation targets at knowledge acquisition and dissemination from a labeled source domain to an unlabeled target domain under distribution shift.
Recent advances show that deep pre-trained models of large scale endow rich knowledge to tackle diverse downstream tasks of small scale.
This paper introduces Partial Domain Adaptation (PDA), a learning paradigm that relaxes the identical class space assumption to that the source class space subsumes the target class space.
arXiv Detail & Related papers (2022-03-14T07:02:45Z) - AdaRL: What, Where, and How to Adapt in Transfer Reinforcement Learning [18.269412736181852]
We propose a principled framework for adaptive RL, called AdaRL, that adapts reliably to changes across domains.
We show that AdaRL can adapt the policy with only a few samples without further policy optimization in the target domain.
We illustrate the efficacy of AdaRL through a series of experiments that allow for changes in different components of Cartpole and Atari games.
arXiv Detail & Related papers (2021-07-06T16:56:25Z) - LEADS: Learning Dynamical Systems that Generalize Across Environments [12.024388048406587]
We propose LEADS, a novel framework that leverages the commonalities and discrepancies among known environments to improve model generalization.
We show that this new setting can exploit knowledge extracted from environment-dependent data and improves generalization for both known and novel environments.
arXiv Detail & Related papers (2021-06-08T17:28:19Z) - Generalizing Decision Making for Automated Driving with an Invariant
Environment Representation using Deep Reinforcement Learning [55.41644538483948]
Current approaches either do not generalize well beyond the training data or are not capable to consider a variable number of traffic participants.
We propose an invariant environment representation from the perspective of the ego vehicle.
We show that the agents are capable to generalize successfully to unseen scenarios, due to the abstraction.
arXiv Detail & Related papers (2021-02-12T20:37:29Z) - Ecological Reinforcement Learning [76.9893572776141]
We study the kinds of environment properties that can make learning under such conditions easier.
understanding how properties of the environment impact the performance of reinforcement learning agents can help us to structure our tasks in ways that make learning tractable.
arXiv Detail & Related papers (2020-06-22T17:55:03Z) - Unsupervised Domain Adaptation in Person re-ID via k-Reciprocal
Clustering and Large-Scale Heterogeneous Environment Synthesis [76.46004354572956]
We introduce an unsupervised domain adaptation approach for person re-identification.
Experimental results show that the proposed ktCUDA and SHRED approach achieves an average improvement of +5.7 mAP in re-identification performance.
arXiv Detail & Related papers (2020-01-14T17:43:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.