Mapping representations in Reinforcement Learning via Semantic Alignment for Zero-Shot Stitching
- URL: http://arxiv.org/abs/2503.01881v1
- Date: Wed, 26 Feb 2025 22:06:00 GMT
- Title: Mapping representations in Reinforcement Learning via Semantic Alignment for Zero-Shot Stitching
- Authors: Antonio Pio Ricciardi, Valentino Maiorca, Luca Moschella, Riccardo Marin, Emanuele Rodolà
- Abstract summary: Deep Reinforcement Learning models often fail to generalize when even small changes occur in the environment's observations or task requirements. We propose a zero-shot method for mapping between latent spaces across different agents trained on different visual and task variations. We empirically demonstrate zero-shot stitching performance on the CarRacing environment with changing background and task.
- Score: 17.76990521486307
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Reinforcement Learning (RL) models often fail to generalize when even small changes occur in the environment's observations or task requirements. Addressing these shifts typically requires costly retraining, limiting the reusability of learned policies. In this paper, we build on recent work in semantic alignment to propose a zero-shot method for mapping between latent spaces across different agents trained on different visual and task variations. Specifically, we learn a transformation that maps embeddings from one agent's encoder to another agent's encoder without further fine-tuning. Our approach relies on a small set of semantically aligned "anchor" observations, which we use to estimate an affine or orthogonal transform. Once the transformation is found, an existing controller trained for one domain can interpret embeddings from a different (existing) encoder in a zero-shot fashion, skipping additional training. We empirically demonstrate that our framework preserves high performance under visual and task domain shifts, evaluating zero-shot stitching on the CarRacing environment with changing backgrounds and tasks. By allowing modular re-assembly of existing policies, our approach paves the way for more robust, compositional RL in dynamically changing environments.
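The alignment step described in the abstract lends itself to a compact illustration. The following is a minimal sketch (not the authors' code) of how an orthogonal or affine map between two agents' latent spaces could be estimated from paired anchor embeddings and then used to stitch one agent's encoder to another agent's controller; all names (encoder_a, controller_b, the anchor sets) are hypothetical placeholders.

```python
# Minimal sketch of anchor-based latent-space alignment for zero-shot stitching.
# Assumes paired anchor embeddings: Z_a = encoder_a(anchors), Z_b = encoder_b(anchors).
# All names are illustrative; this is not the paper's reference implementation.
import numpy as np

def fit_orthogonal(Z_a: np.ndarray, Z_b: np.ndarray) -> np.ndarray:
    """Orthogonal Procrustes: find R minimizing ||Z_a @ R - Z_b||_F."""
    U, _, Vt = np.linalg.svd(Z_a.T @ Z_b)
    return U @ Vt

def fit_affine(Z_a: np.ndarray, Z_b: np.ndarray):
    """Least-squares affine map: Z_b ~= Z_a @ W + b."""
    Z_aug = np.hstack([Z_a, np.ones((Z_a.shape[0], 1))])
    sol, *_ = np.linalg.lstsq(Z_aug, Z_b, rcond=None)
    return sol[:-1], sol[-1]  # weight matrix W, bias vector b

# Usage: embed the same semantically aligned anchors with both encoders,
# estimate the map once, then reuse it at test time without retraining:
#   R = fit_orthogonal(Z_a, Z_b)
#   z = encoder_a(observation)        # embedding in agent A's latent space
#   action = controller_b(z @ R)      # agent B's controller, zero-shot
```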
Related papers
- Navigating Semantic Drift in Task-Agnostic Class-Incremental Learning [51.177789437682954]
Class-incremental learning (CIL) seeks to enable a model to sequentially learn new classes while retaining knowledge of previously learned ones.
Balancing flexibility and stability remains a significant challenge, particularly when the task ID is unknown.
We propose a novel semantic drift calibration method that incorporates mean shift compensation and covariance calibration.
arXiv Detail & Related papers (2025-02-11T13:57:30Z)
- Analyzing Fine-tuning Representation Shift for Multimodal LLMs Steering alignment [53.90425382758605]
We show how fine-tuning alters the internal structure of a model to specialize in new multimodal tasks.
Our work sheds light on how multimodal representations evolve through fine-tuning and offers a new perspective for interpreting model adaptation in multimodal tasks.
arXiv Detail & Related papers (2025-01-06T13:37:13Z)
- CLIP with Generative Latent Replay: a Strong Baseline for Incremental Learning [17.614980614656407]
We propose Continual Generative training for Incremental prompt-Learning.
We exploit Variational Autoencoders to learn class-conditioned distributions.
We show that such a generative replay approach can adapt to new tasks while improving zero-shot capabilities.
arXiv Detail & Related papers (2024-07-22T16:51:28Z)
- R3L: Relative Representations for Reinforcement Learning [17.76990521486307]
It is known that variations in input domains (e.g., different panorama colors due to seasonal changes) can disrupt agent performance.
Recent advancements in the field of representation learning have demonstrated the possibility of combining components to create new models.
We adapt this framework to the Visual Reinforcement Learning setting, allowing agent components to be combined into new agents capable of effectively handling novel visual-task pairs.
arXiv Detail & Related papers (2024-04-19T14:42:42Z)
- Latent Space Translation via Semantic Alignment [29.2401314068038]
We show how representations learned from different neural modules can be translated between different pre-trained networks.
Our method directly estimates a transformation between two given latent spaces, thereby enabling effective stitching of encoders and decoders without additional training.
Notably, we show how it is possible to zero-shot stitch text encoders and vision decoders, or vice-versa, yielding surprisingly good classification performance in this multimodal setting.
arXiv Detail & Related papers (2023-11-01T17:12:00Z)
- RePo: Resilient Model-Based Reinforcement Learning by Regularizing Posterior Predictability [25.943330238941602]
We propose a visual model-based RL method that learns a latent representation resilient to spurious variations.
Our training objective encourages the representation to be maximally predictive of dynamics and reward.
Our effort is a step towards making model-based RL a practical and useful tool for dynamic, diverse domains.
arXiv Detail & Related papers (2023-08-31T18:43:04Z)
- Bilevel Fast Scene Adaptation for Low-Light Image Enhancement [50.639332885989255]
Enhancing images in low-light scenes is a challenging but widely studied task in computer vision.
The main obstacle lies in modeling the distribution discrepancy across different scenes.
We introduce the bilevel paradigm to model the above latent correspondence.
A bilevel learning framework is constructed to endow the scene-irrelevant generality of the encoder towards diverse scenes.
arXiv Detail & Related papers (2023-06-02T08:16:21Z)
- Homomorphism Autoencoder -- Learning Group Structured Representations from Observed Transitions [51.71245032890532]
We propose methods enabling an agent acting upon the world to learn internal representations of sensory information consistent with actions that modify it.
In contrast to existing work, our approach does not require prior knowledge of the group and does not restrict the set of actions the agent can perform.
arXiv Detail & Related papers (2022-07-25T11:22:48Z)
- Vision Transformers: From Semantic Segmentation to Dense Prediction [139.15562023284187]
We explore the global context learning potentials of vision transformers (ViTs) for dense visual prediction.
Our motivation is that through learning global context at full receptive field layer by layer, ViTs may capture stronger long-range dependency information.
We formulate a family of Hierarchical Local-Global (HLG) Transformers, characterized by local attention within windows and global attention across windows in a pyramidal architecture.
arXiv Detail & Related papers (2022-07-19T15:49:35Z)
- AdaRL: What, Where, and How to Adapt in Transfer Reinforcement Learning [18.269412736181852]
We propose a principled framework for adaptive RL, called AdaRL, that adapts reliably to changes across domains.
We show that AdaRL can adapt the policy with only a few samples without further policy optimization in the target domain.
We illustrate the efficacy of AdaRL through a series of experiments that allow for changes in different components of Cartpole and Atari games.
arXiv Detail & Related papers (2021-07-06T16:56:25Z)
- PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning [102.36450942613091]
We propose an inverse reinforcement learning algorithm called inverse temporal difference learning (ITD).
We show how to seamlessly integrate ITD with learning from online environment interactions, arriving at a novel algorithm for reinforcement learning with demonstrations, called ΨΦ-learning.
arXiv Detail & Related papers (2021-02-24T21:12:09Z)
- Unsupervised Controllable Generation with Self-Training [90.04287577605723]
Controllable generation with GANs remains a challenging research problem.
We propose an unsupervised framework to learn a distribution of latent codes that control the generator through self-training.
Our framework exhibits better disentanglement compared to other variants such as the variational autoencoder.
arXiv Detail & Related papers (2020-07-17T21:50:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.