RESample: A Robust Data Augmentation Framework via Exploratory Sampling for Robotic Manipulation
- URL: http://arxiv.org/abs/2510.17640v2
- Date: Fri, 24 Oct 2025 13:01:37 GMT
- Title: RESample: A Robust Data Augmentation Framework via Exploratory Sampling for Robotic Manipulation
- Authors: Yuquan Xue, Guanxing Lu, Zhenyu Wu, Chuanrui Zhang, Bofang Jia, Zhengyi Gu, Yansong Tang, Ziwei Wang
- Abstract summary: Vision-Language-Action models (VLAs) have demonstrated remarkable performance on complex robotic manipulation tasks through imitation learning. We propose an automated OOD data augmentation framework named RESample through exploratory sampling. We conduct extensive experiments on the LIBERO benchmark as well as real-world robotic manipulation tasks, demonstrating that RESample consistently improves the stability and generalization ability of VLA models.
- Score: 35.01431111972725
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vision-Language-Action models (VLAs) have demonstrated remarkable performance on complex robotic manipulation tasks through imitation learning. However, existing imitation learning datasets contain only successful trajectories and lack failure or recovery data, especially for out-of-distribution (OOD) states where the robot deviates from the main policy due to minor perturbations or errors, leading VLA models to struggle with states deviating from the training distribution. To this end, we propose an automated OOD data augmentation framework named RESample through exploratory sampling. Specifically, we first leverage offline reinforcement learning to obtain an action-value network that accurately identifies sub-optimal actions under the current manipulation policy. We further sample potential OOD states from trajectories via rollout, and design an exploratory sampling mechanism that adaptively incorporates these action proxies into the training dataset to ensure efficiency. Subsequently, our framework explicitly encourages the VLAs to recover from OOD states and enhances their robustness against distributional shifts. We conduct extensive experiments on the LIBERO benchmark as well as real-world robotic manipulation tasks, demonstrating that RESample consistently improves the stability and generalization ability of VLA models.
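The pipeline the abstract describes (an offline action-value network flagging sub-optimal actions, rollouts surfacing potential OOD states, and selective augmentation) can be sketched in a toy form. The policy, Q-function, noise scales, and gap threshold below are illustrative stand-ins of ours, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: an imperfect policy and an action-value
# function Q(s, a). In RESample, Q comes from offline RL; here it is
# a toy quadratic whose optimum action is exactly -state.
def policy(state):
    return -0.5 * state  # deliberately sub-optimal controller

def q_value(state, action):
    return -float(np.sum((action + state) ** 2))

def exploratory_sample(states, n_candidates=16, gap_threshold=0.5):
    """Flag states where the policy's action is clearly sub-optimal
    under Q, and pair each with a better (recovery) action."""
    augmented = []
    for s in states:
        a_pi = policy(s)
        # Sample candidate actions around the policy's proposal.
        candidates = a_pi + rng.normal(scale=0.5, size=(n_candidates, s.shape[0]))
        q_cands = np.array([q_value(s, a) for a in candidates])
        # Sub-optimality gap: how much better the best candidate scores.
        gap = q_cands.max() - q_value(s, a_pi)
        if gap > gap_threshold:  # OOD-like: policy deviates from Q's optimum
            augmented.append((s, candidates[q_cands.argmax()]))
    return augmented

# Roll out from perturbed starts to collect potential OOD states.
states = [rng.normal(scale=2.0, size=2) for _ in range(32)]
extra = exploratory_sample(states)
print(f"{len(extra)} of {len(states)} states selected for augmentation")
```

The selected (state, recovery action) pairs would then be folded back into the training set so the policy learns to return from those states.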
Related papers
- Self-Correcting VLA: Online Action Refinement via Sparse World Imagination [55.982504915794514]
We propose Self-Correcting VLA (SC-VLA), which achieves self-improvement by intrinsically guiding action refinement through sparse imagination. SC-VLA achieves state-of-the-art performance, yielding the highest task throughput with 16% fewer steps and a 9% higher success rate than the best-performing baselines.
arXiv Detail & Related papers (2026-02-25T06:58:06Z) - Steering Vision-Language-Action Models as Anti-Exploration: A Test-Time Scaling Approach [78.4812458793128]
We propose TACO, a test-time-scaling framework that applies a lightweight pseudo-count estimator as a high-fidelity verifier of action chunks. Our method resembles the classical anti-exploration principle in offline reinforcement learning (RL), and being gradient-free, it yields significant computational savings.
arXiv Detail & Related papers (2025-12-02T14:42:54Z) - Human-in-the-loop Online Rejection Sampling for Robotic Manipulation [55.99788088622936]
Hi-ORS stabilizes value estimation by filtering out negatively rewarded samples during online fine-tuning. Hi-ORS fine-tunes a pi-base policy to master contact-rich manipulation in just 1.5 hours of real-world training.
arXiv Detail & Related papers (2025-10-30T11:53:08Z) - SCOPED: Score-Curvature Out-of-distribution Proximity Evaluator for Diffusion [5.008779702997125]
Out-of-distribution (OOD) detection is essential for reliable deployment of machine learning systems in vision, robotics, reinforcement learning, and beyond. We introduce Score-Curvature Out-of-distribution Proximity Evaluator for Diffusion (SCOPED). SCOPED is computed from a single diffusion model trained once on a diverse dataset, and combines the Jacobian trace and squared norm of the model's score function into a single test statistic. On four vision benchmarks, SCOPED achieves competitive or state-of-the-art precision-recall scores despite its low computational cost.
arXiv Detail & Related papers (2025-10-01T20:54:49Z) - Align-Then-stEer: Adapting the Vision-Language Action Models through Unified Latent Guidance [63.33213516925946]
We introduce Align-Then-stEer (ATE), a novel, data-efficient, and plug-and-play adaptation framework. Our work presents a general and lightweight solution that greatly enhances the practicality of deploying VLA models to new robotic platforms and tasks.
arXiv Detail & Related papers (2025-09-02T07:51:59Z) - Model-Based Offline Reinforcement Learning with Reliability-Guaranteed Sequence Modeling [0.0]
Model-based offline reinforcement learning (MORL) aims to learn a policy by exploiting a dynamics model derived from an existing dataset. We propose a new MORL algorithm, Reliability-guaranteed Transformer (RT), which can eliminate unreliable trajectories.
arXiv Detail & Related papers (2025-02-10T14:08:55Z) - ARES: Auxiliary Range Expansion for Outlier Synthesis [1.7306463705863946]
We propose a novel methodology for OOD detection named Auxiliary Range Expansion for Outlier Synthesis (ARES). ARES proceeds in several stages that ultimately generate valuable OOD-like virtual instances. The energy score-based discriminator is then trained to effectively separate in-distribution data and outlier data.
arXiv Detail & Related papers (2025-01-11T05:44:33Z) - TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies [95.30717188630432]
We introduce visual trace prompting to facilitate VLA models' spatial-temporal awareness for action prediction. We develop a new TraceVLA model by finetuning OpenVLA on our own collected dataset of 150K robot manipulation trajectories. We present a compact VLA model based on 4B Phi-3-Vision, pretrained on the Open-X-Embodiment and finetuned on our dataset.
arXiv Detail & Related papers (2024-12-13T18:40:51Z) - Enhancing OOD Detection Using Latent Diffusion [3.4899193297791054]
Out-of-distribution (OOD) detection is crucial for the reliable deployment of machine learning models in real-world scenarios. Recent efforts have explored using generative models, such as Stable Diffusion, to synthesize outlier data in the pixel space. We propose Outlier-Aware Learning (OAL), a novel framework that generates synthetic OOD training data within the latent space.
arXiv Detail & Related papers (2024-06-24T11:01:43Z) - Model Reprogramming Outperforms Fine-tuning on Out-of-distribution Data in Text-Image Encoders [56.47577824219207]
In this paper, we unveil the hidden costs associated with intrusive fine-tuning techniques.
We introduce a new model reprogramming approach for fine-tuning, which we name Reprogrammer.
Our empirical evidence reveals that Reprogrammer is less intrusive and yields superior downstream models.
arXiv Detail & Related papers (2024-03-16T04:19:48Z) - RODD: A Self-Supervised Approach for Robust Out-of-Distribution Detection [12.341250124228859]
We propose a simple yet effective generalized OOD detection method independent of out-of-distribution datasets.
Our approach relies on self-supervised feature learning of the training samples, where the embeddings lie on a compact low-dimensional space.
We empirically show that a pre-trained model with self-supervised contrastive learning yields a better model for uni-dimensional feature learning in the latent space.
arXiv Detail & Related papers (2022-04-06T03:05:58Z) - Distributionally Robust Recurrent Decoders with Random Network Distillation [93.10261573696788]
We propose a method based on OOD detection with Random Network Distillation to allow an autoregressive language model to disregard OOD context during inference.
We apply our method to a GRU architecture, demonstrating improvements on multiple language modeling (LM) datasets.
arXiv Detail & Related papers (2021-10-25T19:26:29Z)
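A minimal Random Network Distillation sketch of the OOD signal used above: a predictor is fit to a fixed random target network on in-distribution data, and its error flags OOD inputs. Here a linear least-squares fit stands in for gradient training of the predictor; all specifics are our simplifications.

```python
import numpy as np

rng = np.random.default_rng(3)

d_in, d_out = 16, 8
# Fixed, randomly initialized target network (never trained).
W_target = 0.1 * rng.normal(size=(d_out, d_in))

def target(x):
    return np.tanh(W_target @ x)

# Fit a linear predictor on in-distribution data by least squares
# (stand-in for gradient training of the predictor network).
X_train = rng.normal(size=(512, d_in))
Y_train = np.array([target(x) for x in X_train])
W_pred, *_ = np.linalg.lstsq(X_train, Y_train, rcond=None)

def novelty(x):
    """Distillation error: low where the predictor was trained,
    high on out-of-distribution inputs."""
    return float(np.sum((target(x) - x @ W_pred) ** 2))

x_in = rng.normal(size=d_in)           # in-distribution input
x_out = rng.normal(loc=5.0, size=d_in) # shifted, OOD input
print(f"in-dist error {novelty(x_in):.4f}, OOD error {novelty(x_out):.4f}")
```

Thresholding this error is what lets the model disregard OOD context at inference time.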