Two-Steps Diffusion Policy for Robotic Manipulation via Genetic Denoising
- URL: http://arxiv.org/abs/2510.21991v1
- Date: Fri, 24 Oct 2025 19:52:41 GMT
- Title: Two-Steps Diffusion Policy for Robotic Manipulation via Genetic Denoising
- Authors: Mateo Clemente, Leo Brunswic, Rui Heng Yang, Xuan Zhao, Yasser Khalil, Haoyu Lei, Amir Rasouli, Yinchuan Li
- Abstract summary: Diffusion models have achieved state-of-the-art results in robotic manipulation by imitating expert demonstrations. We show that by tailoring the denoising process to the specific characteristics of embodied AI tasks, diffusion policies can operate effectively. We propose a population-based sampling strategy, genetic denoising, which enhances both performance and stability.
- Score: 22.356276412952738
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models, such as diffusion policy, have achieved state-of-the-art results in robotic manipulation by imitating expert demonstrations. While diffusion models were originally developed for vision tasks like image and video generation, many of their inference strategies have been directly transferred to control domains without adaptation. In this work, we show that by tailoring the denoising process to the specific characteristics of embodied AI tasks -- particularly the structured, low-dimensional nature of action distributions -- diffusion policies can operate effectively with as few as 5 neural function evaluations (NFE). Building on this insight, we propose a population-based sampling strategy, genetic denoising, which enhances both performance and stability by selecting denoising trajectories with low out-of-distribution risk. Our method solves challenging tasks with only 2 NFE while improving or matching performance. We evaluate our approach across 14 robotic manipulation tasks from D4RL and Robomimic, spanning multiple action horizons and inference budgets. In over 2 million evaluations, our method consistently outperforms standard diffusion-based policies, achieving up to 20% performance gains with significantly fewer inference steps.
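The abstract's core idea can be illustrated with a toy sketch: run a few deterministic denoising steps over a population of candidate actions, and at each step keep only the trajectories a risk estimate flags as in-distribution. This is a minimal illustration under stated assumptions, not the authors' implementation; `score_fn` (a toy noise predictor) and `ood_risk` (a stand-in out-of-distribution score) are hypothetical placeholders for learned models.

```python
import numpy as np

def denoise_step(x, t_from, t_to, score_fn, rng):
    # One DDIM-style deterministic update with a small exploration perturbation.
    eps = score_fn(x, t_from)            # estimated noise at level t_from
    x0_hat = x - t_from * eps            # predicted clean action
    return x0_hat + t_to * eps + 0.05 * rng.standard_normal(x.shape)

def genetic_denoise(score_fn, ood_risk, dim, pop=8, steps=2, seed=0):
    """Population-based sampling: after each denoising step, discard the
    high-OOD-risk half of the population and duplicate the survivors."""
    rng = np.random.default_rng(seed)
    sigmas = np.linspace(1.0, 0.0, steps + 1)        # noise schedule
    x = rng.standard_normal((pop, dim))              # population of noisy actions
    for t_from, t_to in zip(sigmas[:-1], sigmas[1:]):
        x = denoise_step(x, t_from, t_to, score_fn, rng)
        keep = np.argsort(ood_risk(x))[: pop // 2]   # select low-risk trajectories
        x = np.repeat(x[keep], 2, axis=0)            # refill the population
    return x[np.argmin(ood_risk(x))]                 # lowest-risk final action
```

With `steps=2` this performs only two score-model calls per trajectory, matching the 2-NFE regime the abstract describes; the selection pressure, rather than extra denoising steps, is what keeps samples on the action manifold.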
Related papers
- RoDiF: Robust Direct Fine-Tuning of Diffusion Policies with Corrupted Human Feedback [4.908765539565052]
We introduce a Unified Markov Decision Process (MDP) formulation that coherently integrates the diffusion denoising chain with environmental dynamics. We propose RoDiF (Robust Direct Fine-Tuning), a method that explicitly addresses corrupted human preferences.
arXiv Detail & Related papers (2026-01-31T20:17:15Z) - ForeDiffusion: Foresight-Conditioned Diffusion Policy via Future View Construction for Robot Manipulation [37.67451041903772]
Diffusion strategies have advanced visuomotor control by progressively denoising high-dimensional action sequences. This paper proposes Foresight-Conditioned Diffusion (ForeDiffusion), which injects the predicted future view representation into the diffusion process. ForeDiffusion achieves an average success rate of 80% on the overall task, significantly outperforming existing mainstream diffusion methods by 23% on complex tasks.
arXiv Detail & Related papers (2026-01-19T10:28:42Z) - Dual-Stream Diffusion for World-Model Augmented Vision-Language-Action Model [62.889356203346985]
We propose DUal-STream diffusion (DUST), a world-model augmented VLA framework that handles the modality conflict. DUST achieves up to 6% gains over a standard VLA baseline and implicit world-modeling methods. On real-world tasks with the Franka Research 3, DUST outperforms baselines in success rate by 13%.
arXiv Detail & Related papers (2025-10-31T16:32:12Z) - STITCH-OPE: Trajectory Stitching with Guided Diffusion for Off-Policy Evaluation [18.55356623615343]
Off-policy evaluation (OPE) estimates the performance of a target policy using offline data collected from a behavior policy. Existing OPE methods are ineffective for high-dimensional, long-horizon problems. We propose STITCH-OPE, a model-based generative framework that leverages denoising diffusion for long-horizon OPE.
arXiv Detail & Related papers (2025-05-27T06:39:26Z) - Fine-tuning Diffusion Policies with Backpropagation Through Diffusion Timesteps [13.28742762414913]
We introduce NCDPO, a novel framework that reformulates Diffusion Policy as a noise-conditioned deterministic policy. Our experiments demonstrate that NCDPO achieves sample efficiency comparable to Proximal Policy Optimization (PPO) when training from scratch.
arXiv Detail & Related papers (2025-05-15T16:33:44Z) - Fast Adaptation with Behavioral Foundation Models [82.34700481726951]
Unsupervised zero-shot reinforcement learning has emerged as a powerful paradigm for pretraining behavioral foundation models. Despite promising results, zero-shot policies are often suboptimal due to errors induced by the unsupervised training process. We propose fast adaptation strategies that search in the low-dimensional task-embedding space of the pre-trained BFM to rapidly improve the performance of its zero-shot policies.
arXiv Detail & Related papers (2025-04-10T16:14:17Z) - HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model [54.64088247291416]
A fundamental objective of manipulation policy design is to enable robots to comprehend human instructions, reason about scene cues, and execute generalized actions in dynamic environments. Recent autoregressive vision-language-action (VLA) methods inherit common-sense reasoning capabilities from vision-language models (VLMs) for next action-token prediction. We introduce HybridVLA, a unified framework that absorbs the continuous nature of diffusion-based actions and the contextual reasoning of autoregression.
arXiv Detail & Related papers (2025-03-13T17:59:52Z) - One-Step Diffusion Model for Image Motion-Deblurring [85.76149042561507]
We propose a one-step diffusion model for deblurring (OSDD), a novel framework that reduces the denoising process to a single step. To tackle fidelity loss in diffusion models, we introduce an enhanced variational autoencoder (eVAE), which improves structural restoration. Our method achieves strong performance on both full and no-reference metrics.
arXiv Detail & Related papers (2025-03-09T09:39:57Z) - One-Step Diffusion Policy: Fast Visuomotor Policies via Diffusion Distillation [80.71541671907426]
OneStep Diffusion Policy (OneDP) is a novel approach that distills knowledge from pre-trained diffusion policies into a single-step action generator.
OneDP significantly accelerates response times for robotic control tasks.
arXiv Detail & Related papers (2024-10-28T17:54:31Z) - Score Regularized Policy Optimization through Diffusion Behavior [25.926641622408752]
Recent developments in offline reinforcement learning have uncovered the immense potential of diffusion modeling.
We propose to extract an efficient deterministic inference policy from critic models and pretrained diffusion behavior models.
Our method boosts action sampling speed by more than 25 times compared with various leading diffusion-based methods in locomotion tasks.
arXiv Detail & Related papers (2023-10-11T08:31:26Z) - Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning [70.20191211010847]
Offline reinforcement learning (RL) aims to learn an optimal policy using a previously collected static dataset.
We introduce Diffusion Q-learning (Diffusion-QL) that utilizes a conditional diffusion model to represent the policy.
We show that our method can achieve state-of-the-art performance on the majority of the D4RL benchmark tasks.
arXiv Detail & Related papers (2022-08-12T09:54:11Z) - Training and Evaluation of Deep Policies using Reinforcement Learning and Generative Models [67.78935378952146]
GenRL is a framework for solving sequential decision-making problems.
It exploits the combination of reinforcement learning and latent variable generative models.
We experimentally determine the characteristics of generative models that have most influence on the performance of the final policy training.
arXiv Detail & Related papers (2022-04-18T22:02:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.