GRAM: Generalization in Deep RL with a Robust Adaptation Module
- URL: http://arxiv.org/abs/2412.04323v2
- Date: Tue, 10 Jun 2025 00:21:29 GMT
- Title: GRAM: Generalization in Deep RL with a Robust Adaptation Module
- Authors: James Queeney, Xiaoyi Cai, Alexander Schperberg, Radu Corcodel, Mouhacine Benosman, Jonathan P. How
- Abstract summary: In this work, we present a framework for dynamics generalization in deep reinforcement learning. We introduce a robust adaptation module that provides a mechanism for identifying and reacting to both in-distribution and out-of-distribution environment dynamics. Our algorithm GRAM achieves strong generalization performance across in-distribution and out-of-distribution scenarios upon deployment.
- Score: 62.662894174616895
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The reliable deployment of deep reinforcement learning in real-world settings requires the ability to generalize across a variety of conditions, including both in-distribution scenarios seen during training as well as novel out-of-distribution scenarios. In this work, we present a framework for dynamics generalization in deep reinforcement learning that unifies these two distinct types of generalization within a single architecture. We introduce a robust adaptation module that provides a mechanism for identifying and reacting to both in-distribution and out-of-distribution environment dynamics, along with a joint training pipeline that combines the goals of in-distribution adaptation and out-of-distribution robustness. Our algorithm GRAM achieves strong generalization performance across in-distribution and out-of-distribution scenarios upon deployment, which we demonstrate through extensive simulation and hardware locomotion experiments on a quadruped robot.
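The abstract describes an adaptation module that identifies whether the current environment dynamics are in-distribution or out-of-distribution and reacts accordingly. A minimal sketch of that idea is below; the gating rule, distance measure, and class names are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

# Minimal sketch of a robust adaptation module in the spirit of GRAM
# (all details here are assumptions, not the paper's actual design).
# An encoder elsewhere maps recent transitions to a latent context z;
# this module scores how far z falls from the latents seen during
# training and, when it looks out-of-distribution, blends toward a
# fixed "robust" latent used for unfamiliar dynamics.

class RobustAdaptationModule:
    def __init__(self, train_latents: np.ndarray, robust_latent: np.ndarray,
                 threshold: float = 3.0):
        # Per-dimension statistics of latents observed during training.
        self.mean = train_latents.mean(axis=0)
        self.std = train_latents.std(axis=0) + 1e-8
        self.robust_latent = robust_latent
        self.threshold = threshold

    def ood_score(self, z: np.ndarray) -> float:
        # Normalized distance of the inferred latent from training latents.
        return float(np.linalg.norm((z - self.mean) / self.std) / np.sqrt(z.size))

    def __call__(self, z: np.ndarray) -> np.ndarray:
        # In-distribution: pass the adaptive latent through unchanged.
        # Out-of-distribution: fall back to the robust latent.
        alpha = min(1.0, self.ood_score(z) / self.threshold)
        return (1.0 - alpha) * z + alpha * self.robust_latent
```

An in-distribution latent is returned nearly unchanged, while a latent far outside the training statistics is replaced by the robust fallback, giving a single policy input that covers both deployment regimes.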
Related papers
- CogDual: Enhancing Dual Cognition of LLMs via Reinforcement Learning with Implicit Rule-Based Rewards [53.36917093757101]
Role-Playing Language Agents (RPLAs) have emerged as a significant application direction for Large Language Models (LLMs). We introduce CogDual, a novel RPLA adopting a cognize-then-respond reasoning paradigm. By jointly modeling external situational awareness and internal self-awareness, CogDual generates responses with improved character consistency and contextual alignment.
arXiv Detail & Related papers (2025-07-23T02:26:33Z) - SHIELD: Multi-task Multi-distribution Vehicle Routing Solver with Sparsity and Hierarchy [26.708590440636527]
We introduce SHIELD, a novel model that leverages both sparsity and hierarchy principles. We develop a context-based clustering layer that exploits the presence of hierarchical structures in the problems to produce better local representations. Our results demonstrate the superiority of our approach over existing methods on 9 real-world maps with 16 VRP variants each.
arXiv Detail & Related papers (2025-06-10T03:55:14Z) - Improving Controller Generalization with Dimensionless Markov Decision Processes [6.047438841182958]
We propose a model-based approach to increase generalization in which both the world model and the policy are trained in a dimensionless state-action space.
We demonstrate the applicability of our method on simulated actuated pendulum and cartpole systems, where policies trained on a single environment are robust to shifts in the distribution of the context.
arXiv Detail & Related papers (2025-04-14T09:08:53Z) - Toward Task Generalization via Memory Augmentation in Meta-Reinforcement Learning [43.69919534800985]
In reinforcement learning (RL), agents often struggle to perform well on tasks that differ from those encountered during training.
This limitation presents a challenge to the broader deployment of RL in diverse and dynamic task settings.
We introduce memory augmentation, a memory-based RL approach to improve task generalization.
arXiv Detail & Related papers (2025-02-03T17:00:19Z) - Towards Modality Generalization: A Benchmark and Prospective Analysis [68.20973671493203]
This paper introduces Modality Generalization (MG), which focuses on enabling models to generalize to unseen modalities. We propose a comprehensive benchmark featuring multi-modal algorithms and adapt existing methods that focus on generalization. Our work provides a foundation for advancing robust and adaptable multi-modal models, enabling them to handle unseen modalities in realistic scenarios.
arXiv Detail & Related papers (2024-12-24T08:38:35Z) - Towards Generalizable Reinforcement Learning via Causality-Guided Self-Adaptive Representations [22.6449779859417]
General intelligence requires quick adaptation across tasks.
In this paper, we explore a wider range of scenarios where not only the distribution but also the environment spaces may change.
We introduce a causality-guided self-adaptive representation-based approach, called CSR, that equips the agent to generalize effectively.
arXiv Detail & Related papers (2024-07-30T08:48:49Z) - Gaussian Universality in Neural Network Dynamics with Generalized Structured Input Distributions [2.3020018305241337]
We analyze the behavior of a deep learning system trained on inputs modeled as Gaussian mixtures to better simulate more general structured inputs. Under certain standardization schemes, the deep learning model converges toward Gaussian setting behavior, even when the input data follow more complex or real-world distributions.
arXiv Detail & Related papers (2024-05-01T17:10:55Z) - Distribution-Dependent Rates for Multi-Distribution Learning [26.38831409926518]
The recent multi-distribution learning framework tackles this objective through dynamic interaction with the environment.
We provide distribution-dependent guarantees in the MDL regime, that scale with suboptimality gaps and result in superior dependence on the sample size.
We devise an adaptive optimistic algorithm, LCB-DR, that showcases enhanced dependence on the gaps, mirroring the contrast between uniform and optimistic allocation in the multi-armed bandit literature.
arXiv Detail & Related papers (2023-12-20T15:50:16Z) - Generalization Through the Lens of Learning Dynamics [11.009483845261958]
A machine learning (ML) system must learn to generalize to novel situations in order to yield accurate predictions at deployment.
The impressive generalization performance of deep neural networks has defied theoretical explanation.
This thesis will study the learning dynamics of deep neural networks in both supervised and reinforcement learning tasks.
arXiv Detail & Related papers (2022-12-11T00:07:24Z) - Distributionally Adaptive Meta Reinforcement Learning [85.17284589483536]
We develop a framework for meta-RL algorithms that behave appropriately under test-time distribution shifts.
Our framework centers on an adaptive approach to distributional robustness that trains a population of meta-policies to be robust to varying levels of distribution shift.
We show how our framework allows for improved regret under distribution shift, and empirically show its efficacy on simulated robotics problems.
arXiv Detail & Related papers (2022-10-06T17:55:09Z) - Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning [137.39196753245105]
We present a new model-based reinforcement learning algorithm that learns a multi-headed dynamics model for dynamics generalization.
We incorporate context learning, which encodes dynamics-specific information from past experiences into the context latent vector.
Our method exhibits superior zero-shot generalization performance across a variety of control tasks, compared to state-of-the-art RL methods.
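The multi-headed dynamics model described above can be sketched as a simple selection rule: each head predicts next states, and the head with the lowest accumulated error over a trajectory is chosen (and would be the one updated during training). The function below is a hypothetical illustration, not the paper's implementation.

```python
import numpy as np

# Hypothetical sketch of trajectory-wise multiple choice learning:
# several dynamics heads each predict next_state from (state, action),
# and only the head with the lowest total trajectory error is selected.

def select_head(heads, states, actions, next_states):
    errors = []
    for head in heads:
        # Predict every step of the trajectory with this head.
        preds = np.array([head(s, a) for s, a in zip(states, actions)])
        # Score the head by its summed squared error over the trajectory.
        errors.append(np.sum((preds - next_states) ** 2))
    return int(np.argmin(errors))
```

Scoring per trajectory rather than per step lets each head specialize in one dynamics regime, which is the intuition behind the multi-headed model.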
arXiv Detail & Related papers (2020-10-26T03:20:42Z) - Generalization Guarantees for Imitation Learning [6.542289202349586]
Control policies from imitation learning can often fail to generalize to novel environments.
We present rigorous generalization guarantees for imitation learning by leveraging the Probably Approximately Correct (PAC)-Bayes framework.
arXiv Detail & Related papers (2020-08-05T03:04:13Z) - Context-aware Dynamics Model for Generalization in Model-Based Reinforcement Learning [124.9856253431878]
We decompose the task of learning a global dynamics model into two stages: (a) learning a context latent vector that captures the local dynamics, then (b) predicting the next state conditioned on it.
In order to encode dynamics-specific information into the context latent vector, we introduce a novel loss function that encourages the context latent vector to be useful for predicting both forward and backward dynamics.
The proposed method achieves superior generalization ability across various simulated robotics and control tasks, compared to existing RL schemes.
arXiv Detail & Related papers (2020-05-14T08:10:54Z) - From Simulation to Real World Maneuver Execution using Deep Reinforcement Learning [69.23334811890919]
Deep reinforcement learning has proven capable of solving many control tasks in different fields, but the behavior of these systems is not always as expected when deployed in real-world scenarios.
This is mainly due to the lack of domain adaptation between simulated and real-world data, together with the absence of a distinction between training and test datasets.
We present a system based on multiple environments in which agents are trained simultaneously, evaluating the behavior of the model in different scenarios.
arXiv Detail & Related papers (2020-05-13T14:22:20Z) - Invariant Causal Prediction for Block MDPs [106.63346115341862]
Generalization across environments is critical to the successful application of reinforcement learning algorithms to real-world challenges.
We propose a method of invariant prediction to learn model-irrelevance state abstractions (MISA) that generalize to novel observations in the multi-environment setting.
arXiv Detail & Related papers (2020-03-12T21:03:01Z) - Target-Embedding Autoencoders for Supervised Representation Learning [111.07204912245841]
This paper analyzes a framework for improving generalization in a purely supervised setting, where the target space is high-dimensional.
We motivate and formalize the general framework of target-embedding autoencoders (TEA) for supervised prediction, learning intermediate latent representations jointly optimized to be both predictable from features as well as predictive of targets.
arXiv Detail & Related papers (2020-01-23T02:37:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.