Related papers: Contractive Diffusion Policies: Robust Action Diffusion via Contractive Score-Based Sampling with Differential Equations

Contractive Diffusion Policies: Robust Action Diffusion via Contractive Score-Based Sampling with Differential Equations

URL: http://arxiv.org/abs/2601.01003v1
Date: Fri, 02 Jan 2026 23:33:59 GMT
Title: Contractive Diffusion Policies: Robust Action Diffusion via Contractive Score-Based Sampling with Differential Equations
Authors: Amin Abyaneh, Charlotte Morissette, Mohamad H. Danesh, Anas El Houssaini, David Meger, Gregory Dudek, Hsiu-Chin Lin,
Abstract summary: Contractive Diffusion Policies (CDPs) induce contractive behavior in the diffusion sampling dynamics.<n>CDPs often outperform baseline policies, with pronounced benefits under data scarcity.
Score: 19.77729438305312
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Diffusion policies have emerged as powerful generative models for offline policy learning, whose sampling process can be rigorously characterized by a score function guiding a Stochastic Differential Equation (SDE). However, the same score-based SDE modeling that grants diffusion policies the flexibility to learn diverse behavior also incurs solver and score-matching errors, large data requirements, and inconsistencies in action generation. While less critical in image generation, these inaccuracies compound and lead to failure in continuous control settings. We introduce Contractive Diffusion Policies (CDPs) to induce contractive behavior in the diffusion sampling dynamics. Contraction pulls nearby flows closer to enhance robustness against solver and score-matching errors while reducing unwanted action variance. We develop an in-depth theoretical analysis along with a practical implementation recipe to incorporate CDPs into existing diffusion policy architectures with minimal modification and computational cost. We evaluate CDPs for offline learning by conducting extensive experiments in simulation and real-world settings. Across benchmarks, CDPs often outperform baseline policies, with pronounced benefits under data scarcity.

Related papers

Not All Preferences Are Created Equal: Stability-Aware and Gradient-Efficient Alignment for Reasoning Models [52.48582333951919]
We propose a dynamic framework designed to enhance alignment reliability by maximizing the Signal-to-Noise Ratio of policy updates.<n>SAGE (Stability-Aware Gradient Efficiency) integrates a coarse-grained curriculum mechanism that refreshes candidate pools based on model competence.<n> Experiments on multiple mathematical reasoning benchmarks demonstrate that SAGE significantly accelerates convergence and outperforms static baselines.
arXiv Detail & Related papers (2026-02-01T12:56:10Z)
DiffSDA: Unsupervised Diffusion Sequential Disentanglement Across Modalities [27.918193673708473]
We introduce the Diffusion Sequential Disentanglement Autoencoder (DiffSDA), a novel, modal-agnostic framework effective across diverse real-world data modalities.<n>Our experiments on diverse real-world benchmarks demonstrate that DiffSDA outperforms recent state-of-the-art methods in sequential disentanglement.
arXiv Detail & Related papers (2025-10-07T09:30:36Z)
One-Step Flow Policy Mirror Descent [52.31612487608593]
Flow Policy Mirror Descent (FPMD) is an online RL algorithm that enables 1-step sampling during flow policy inference.<n>Our approach exploits a theoretical connection between the distribution variance and the discretization error of single-step sampling in straight flow matching models.
arXiv Detail & Related papers (2025-07-31T15:51:10Z)
On the Interaction of Noise, Compression Role, and Adaptivity under $(L_0, L_1)$-Smoothness: An SDE-based Approach [20.77655203511758]
We study the dynamics of Distributed SGD, Distributed Compressed SGD, and Distributed SignSGD.<n>Our analysis provides insights into the intricate interactions between batch noise, gradient compression, and adaptivity.
arXiv Detail & Related papers (2025-05-30T19:35:15Z)
Generative Latent Neural PDE Solver using Flow Matching [8.397730500554047]
We propose a latent diffusion model for PDE simulation that embeds the PDE state in a lower-dimensional latent space.<n>Our framework uses an autoencoder to map different types of meshes onto a unified structured latent grid, capturing complex geometries.<n> Numerical experiments show that the proposed model outperforms several deterministic baselines in both accuracy and long-term stability.
arXiv Detail & Related papers (2025-03-28T16:44:28Z)
Closure Discovery for Coarse-Grained Partial Differential Equations Using Grid-based Reinforcement Learning [2.9611509639584304]
We propose a systematic approach for identifying closures in under-resolved PDEs using grid-based Reinforcement Learning.<n>We demonstrate the capabilities and limitations of our framework through numerical solutions of the advection equation and the Burgers' equation.
arXiv Detail & Related papers (2024-02-01T19:41:04Z)
Reflected Diffusion Models [93.26107023470979]
We present Reflected Diffusion Models, which reverse a reflected differential equation evolving on the support of the data. Our approach learns the score function through a generalized score matching loss and extends key components of standard diffusion models.
arXiv Detail & Related papers (2023-04-10T17:54:38Z)
Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning [70.20191211010847]
Offline reinforcement learning (RL) aims to learn an optimal policy using a previously collected static dataset. We introduce Diffusion Q-learning (Diffusion-QL) that utilizes a conditional diffusion model to represent the policy. We show that our method can achieve state-of-the-art performance on the majority of the D4RL benchmark tasks.
arXiv Detail & Related papers (2022-08-12T09:54:11Z)
Pseudo-Spherical Contrastive Divergence [119.28384561517292]
We propose pseudo-spherical contrastive divergence (PS-CD) to generalize maximum learning likelihood of energy-based models. PS-CD avoids the intractable partition function and provides a generalized family of learning objectives.
arXiv Detail & Related papers (2021-11-01T09:17:15Z)
False Correlation Reduction for Offline Reinforcement Learning [115.11954432080749]
We propose falSe COrrelation REduction (SCORE) for offline RL, a practically effective and theoretically provable algorithm. We empirically show that SCORE achieves the SoTA performance with 3.1x acceleration on various tasks in a standard benchmark (D4RL)
arXiv Detail & Related papers (2021-10-24T15:34:03Z)
Implicit Distributional Reinforcement Learning [61.166030238490634]
implicit distributional actor-critic (IDAC) built on two deep generator networks (DGNs) Semi-implicit actor (SIA) powered by a flexible policy distribution. We observe IDAC outperforms state-of-the-art algorithms on representative OpenAI Gym environments.
arXiv Detail & Related papers (2020-07-13T02:52:18Z)

This list is automatically generated from the titles and abstracts of the papers in this site.