Offline Goal-conditioned Reinforcement Learning with Quasimetric Representations
- URL: http://arxiv.org/abs/2509.20478v1
- Date: Wed, 24 Sep 2025 18:45:32 GMT
- Title: Offline Goal-conditioned Reinforcement Learning with Quasimetric Representations
- Authors: Vivek Myers, Bill Chunyuan Zheng, Benjamin Eysenbach, Sergey Levine
- Abstract summary: Approaches for goal-conditioned reinforcement learning (GCRL) often use learned state representations to extract goal-reaching policies. We propose an approach that unifies two frameworks for representation structure (contrastive representations and temporal distances), using the structure of a quasimetric representation space (triangle inequality) with the right additional constraints to learn successor representations that enable optimal goal-reaching. Our approach is able to exploit a **quasimetric** distance parameterization to learn **optimal** goal-reaching distances, even with **suboptimal** data and in **stochastic** environments.
- Score: 72.24831946301613
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Approaches for goal-conditioned reinforcement learning (GCRL) often use learned state representations to extract goal-reaching policies. Two frameworks for representation structure have yielded particularly effective GCRL algorithms: (1) *contrastive representations*, in which methods learn "successor features" with a contrastive objective that performs inference over future outcomes, and (2) *temporal distances*, which link the (quasimetric) distance in representation space to the transit time from states to goals. We propose an approach that unifies these two frameworks, using the structure of a quasimetric representation space (triangle inequality) with the right additional constraints to learn successor representations that enable optimal goal-reaching. Unlike past work, our approach is able to exploit a **quasimetric** distance parameterization to learn **optimal** goal-reaching distances, even with **suboptimal** data and in **stochastic** environments. This gives us the best of both worlds: we retain the stability and long-horizon capabilities of Monte Carlo contrastive RL methods, while getting the free stitching capabilities of quasimetric network parameterizations. On existing offline GCRL benchmarks, our representation learning objective improves performance on stitching tasks where methods based on contrastive learning struggle, and on noisy, high-dimensional environments where methods based on quasimetric networks struggle.
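To make the term concrete: a quasimetric is a distance that satisfies the triangle inequality and d(x, x) = 0, but need not be symmetric. A minimal NumPy sketch of one such parameterization (this particular max-ReLU form is only illustrative, not the paper's actual architecture):

```python
import numpy as np

def quasimetric(x: np.ndarray, y: np.ndarray) -> float:
    """Asymmetric distance d(x, y) = max_i ReLU(x_i - y_i).

    Satisfies d(x, x) = 0 and the triangle inequality
    d(x, z) <= d(x, y) + d(y, z), but not symmetry in general,
    which makes it a quasimetric on the representation space.
    """
    return float(np.max(np.maximum(x - y, 0.0)))
```

Because d(x, y) and d(y, x) can differ, such a distance can model one-way dynamics (e.g., falling off a ledge is faster than climbing back up), which a symmetric metric cannot.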
Related papers
- Multistep Quasimetric Learning for Scalable Goal-conditioned Reinforcement Learning [72.24831946301613]
The key question is how to estimate the temporal distance between pairs of observations. We show how these approaches can be integrated into a practical GCRL method that fits a quasimetric distance. We also demonstrate that our method can enable stitching in a real-world robotic manipulation domain.
arXiv Detail & Related papers (2025-11-11T01:28:10Z)
- Dual Goal Representations [57.43956630070019]
We introduce dual goal representations for goal-conditioned reinforcement learning (GCRL). A dual goal representation characterizes a state by "the set of temporal distances from all other states." We empirically show that dual goal representations consistently improve offline goal-reaching performance across 20 state- and pixel-based tasks.
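As a toy illustration of that characterization (the 4-state chain and its distance table below are assumptions for illustration, not the paper's setup), a goal can be represented by the vector of temporal distances from every state:

```python
import numpy as np

# Hypothetical temporal-distance table for a 4-state chain 0-1-2-3,
# where d[s, g] is the number of steps needed to reach g from s.
d = np.abs(np.subtract.outer(np.arange(4), np.arange(4))).astype(float)

def dual_goal_representation(g: int) -> np.ndarray:
    """Represent goal g by its temporal distances from every state."""
    return d[:, g]
```

Two goals then get similar representations exactly when they are similarly hard to reach from everywhere, regardless of how their raw observations look.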
arXiv Detail & Related papers (2025-10-08T07:07:39Z)
- Equivariant Goal Conditioned Contrastive Reinforcement Learning [5.019456977535218]
Contrastive Reinforcement Learning (CRL) provides a promising framework for extracting useful structured representations from unlabeled interactions. We propose Equivariant CRL, which further structures the latent space using equivariant constraints. Our approach consistently outperforms strong baselines across a range of simulated tasks in both state-based and image-based settings.
arXiv Detail & Related papers (2025-07-22T01:13:45Z)
- Topology-Aware CLIP Few-Shot Learning [0.0]
We introduce a topology-aware tuning approach integrating Representation Topology Divergence into the Task Residual framework. By explicitly aligning the topological structures of visual and text representations using a combined RTD and Cross-Entropy loss, our method enhances few-shot performance.
arXiv Detail & Related papers (2025-05-03T04:58:29Z)
- Class Anchor Margin Loss for Content-Based Image Retrieval [97.81742911657497]
We propose a novel repeller-attractor loss that falls within the metric learning paradigm, yet directly optimizes for the L2 metric without the need to generate pairs.
We evaluate the proposed objective in the context of few-shot and full-set training on the CBIR task, by using both convolutional and transformer architectures.
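A rough sketch of such a repeller-attractor objective (the exact hinge form and the `margin` parameter below are assumptions based on the summary, not the paper's loss): each embedding is attracted to its own class anchor in L2 and repelled from every other anchor, with no pair mining.

```python
import numpy as np

def class_anchor_margin_loss(emb: np.ndarray, label: int,
                             anchors: np.ndarray, margin: float = 1.0) -> float:
    """Hypothetical repeller-attractor loss: pull the embedding toward
    its own class anchor (squared L2) and push it at least `margin`
    away from all other class anchors, without forming explicit pairs."""
    attract = float(np.sum((emb - anchors[label]) ** 2))
    repel = 0.0
    for c in range(len(anchors)):
        if c != label:
            dist = float(np.linalg.norm(emb - anchors[c]))
            repel += max(0.0, margin - dist) ** 2
    return attract + repel
```

Comparing against a fixed anchor per class keeps the cost linear in the number of classes rather than quadratic in the number of samples.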
arXiv Detail & Related papers (2023-06-01T12:53:10Z)
- Optimal Goal-Reaching Reinforcement Learning via Quasimetric Learning [73.80728148866906]
Quasimetric Reinforcement Learning (QRL) is a new RL method that utilizes quasimetric models to learn optimal value functions.
On offline and online goal-reaching benchmarks, QRL also demonstrates improved sample efficiency and performance.
arXiv Detail & Related papers (2023-04-03T17:59:58Z)
- Rethinking Goal-conditioned Supervised Learning and Its Connection to Offline RL [49.26825108780872]
Goal-Conditioned Supervised Learning (GCSL) provides a new learning framework by iteratively relabeling and imitating self-generated experiences.
We extend GCSL into a novel offline goal-conditioned RL algorithm, Weighted GCSL (WGCSL).
We show that WGCSL can consistently outperform GCSL and existing state-of-the-art offline methods.
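The relabeling step at the core of GCSL can be sketched as follows (a simplified illustration; the trajectory format and uniform future-goal sampling are assumptions):

```python
import random

def relabel(trajectory):
    """GCSL-style hindsight relabeling (sketch): for each state-action
    pair, sample a state visited later in the same trajectory and treat
    it as the goal, turning every rollout into goal-conditioned
    imitation data even when the original goal was never reached."""
    data = []
    for t, (s, a) in enumerate(trajectory):
        g_idx = random.randrange(t, len(trajectory))
        g = trajectory[g_idx][0]
        data.append((s, a, g))
    return data
```

A policy is then trained by supervised learning on the relabeled (state, action, goal) tuples, and the loop of collecting, relabeling, and imitating is iterated.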
arXiv Detail & Related papers (2022-02-09T14:17:05Z)
- FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization [10.243908145832394]
We study the offline meta-reinforcement learning (OMRL) problem, a paradigm which enables reinforcement learning (RL) algorithms to quickly adapt to unseen tasks.
This problem is still not fully understood, and two major challenges need to be addressed.
We provide analysis and insight showing that some simple design choices can yield substantial improvements over recent approaches.
arXiv Detail & Related papers (2020-10-02T17:13:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.