Self-Augmented Robot Trajectory: Efficient Imitation Learning via Safe Self-augmentation with Demonstrator-annotated Precision
- URL: http://arxiv.org/abs/2509.09893v1
- Date: Thu, 11 Sep 2025 23:10:56 GMT
- Title: Self-Augmented Robot Trajectory: Efficient Imitation Learning via Safe Self-augmentation with Demonstrator-annotated Precision
- Authors: Hanbit Oh, Masaki Murooka, Tomohiro Motoda, Ryoichi Nakajo, Yukiyasu Domae
- Abstract summary: Self-Augmented Robot Trajectory (SART) is a framework that enables policy learning from a single human demonstration. SART achieves substantially higher success rates than policies trained solely on human-collected demonstrations.
- Score: 2.3548641190233264
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Imitation learning is a promising paradigm for training robot agents; however, standard approaches typically require substantial data acquisition -- via numerous demonstrations or random exploration -- to ensure reliable performance. Although exploration reduces human effort, it lacks safety guarantees and often results in frequent collisions -- particularly in clearance-limited tasks (e.g., peg-in-hole) -- thereby, necessitating manual environmental resets and imposing additional human burden. This study proposes Self-Augmented Robot Trajectory (SART), a framework that enables policy learning from a single human demonstration, while safely expanding the dataset through autonomous augmentation. SART consists of two stages: (1) human teaching only once, where a single demonstration is provided and precision boundaries -- represented as spheres around key waypoints -- are annotated, followed by one environment reset; (2) robot self-augmentation, where the robot generates diverse, collision-free trajectories within these boundaries and reconnects to the original demonstration. This design improves the data collection efficiency by minimizing human effort while ensuring safety. Extensive evaluations in simulation and real-world manipulation tasks show that SART achieves substantially higher success rates than policies trained solely on human-collected demonstrations. Video results available at https://sites.google.com/view/sart-il .
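The self-augmentation stage described above can be sketched in a few lines: sample perturbed waypoints inside the demonstrator-annotated precision spheres and keep only collision-free trajectories. This is a minimal illustrative sketch, not the authors' implementation; the function names, the rejection-sampling scheme, and the `is_collision_free` callback are all assumptions introduced here for clarity.

```python
import math
import random

def sample_in_sphere(center, radius, rng):
    """Rejection-sample a point uniformly inside a precision sphere
    centered on a demonstrated waypoint."""
    while True:
        offset = [rng.uniform(-radius, radius) for _ in range(3)]
        if math.dist(offset, (0.0, 0.0, 0.0)) <= radius:
            return [c + o for c, o in zip(center, offset)]

def self_augment(waypoints, radii, is_collision_free,
                 n_trajectories=10, seed=0):
    """Generate perturbed trajectories whose waypoints stay inside the
    annotated precision spheres; keep only collision-free candidates.
    `is_collision_free` stands in for a planner/simulator check."""
    rng = random.Random(seed)
    augmented = []
    while len(augmented) < n_trajectories:
        candidate = [sample_in_sphere(c, r, rng)
                     for c, r in zip(waypoints, radii)]
        if is_collision_free(candidate):
            augmented.append(candidate)
    return augmented
```

In this reading, a tight radius on a clearance-limited waypoint (e.g., at the peg-in-hole insertion point) keeps augmentation conservative exactly where precision matters, while looser spheres elsewhere diversify the dataset at no extra human cost.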
Related papers
- Robometer: Scaling General-Purpose Robotic Reward Models via Trajectory Comparisons [69.87766750714945]
General-purpose robot reward models are typically trained to predict absolute task progress from expert demonstrations. We introduce Robometer, a scalable reward modeling framework that combines intra-trajectory progress supervision with inter-trajectory preference supervision. Robometer is trained with a dual objective: a frame-level progress loss that anchors reward magnitude on expert data, and a trajectory-comparison preference loss that imposes global ordering constraints.
arXiv Detail & Related papers (2026-03-02T17:38:58Z) - Scalable Dexterous Robot Learning with AR-based Remote Human-Robot Interactions [8.111267700755986]
This paper focuses on scalable robot learning for manipulation with dexterous robot arm-hand systems. We present a unified framework to address the general manipulation task problem.
arXiv Detail & Related papers (2026-02-07T03:47:21Z) - From Human Hands to Robot Arms: Manipulation Skills Transfer via Trajectory Alignment [36.08997778717271]
Learning diverse manipulation skills for real-world robots is bottlenecked by reliance on costly and hard-to-scale teleoperated demonstrations. We introduce Traj2Action, a novel framework that bridges this embodiment gap by using the 3D trajectory of the operational endpoint as a unified intermediate representation. Our policy first learns to generate a coarse trajectory, which forms a high-level motion plan by leveraging both human and robot data.
arXiv Detail & Related papers (2025-10-01T04:21:12Z) - SOE: Sample-Efficient Robot Policy Self-Improvement via On-Manifold Exploration [58.05143960563826]
On-Manifold Exploration (SOE) is a framework that enhances policy exploration and improvement in robotic manipulation. SOE learns a compact latent representation of task-relevant factors and constrains exploration to the manifold of valid actions. It can be seamlessly integrated with arbitrary policy models as a plug-in module, augmenting exploration without degrading the base policy performance.
arXiv Detail & Related papers (2025-09-23T17:54:47Z) - Dexplore: Scalable Neural Control for Dexterous Manipulation from Reference-Scoped Exploration [58.4036440289082]
Hand-object motion-capture (MoCap) data offers large-scale, contact-rich demonstrations and holds promise for dexterous robot manipulation. We introduce Dexplore, a unified single-loop optimization that performs retargeting and tracking to learn robot control policies directly from MoCap at scale.
arXiv Detail & Related papers (2025-09-11T17:59:07Z) - DemoDiffusion: One-Shot Human Imitation using pre-trained Diffusion Policy [33.18108154271181]
We propose DemoDiffusion, a simple and scalable method for enabling robots to perform manipulation tasks in natural environments. Our approach is based on two key insights. First, the hand motion in a human demonstration provides a useful prior for the robot's end-effector trajectory. Second, while this retargeted motion captures the overall structure of the task, it may not align well with plausible robot actions in-context.
arXiv Detail & Related papers (2025-06-25T17:59:01Z) - Imitation Learning with Precisely Labeled Human Demonstrations [0.0]
This work builds on prior studies that demonstrate the viability of using hand-held grippers for efficient data collection. We leverage the user's control over the gripper's appearance -- specifically by assigning it a unique, easily segmentable color -- to enable precise end-effector pose estimation. We show in simulation that precisely labeled human demonstrations on their own allow policies to reach on average 88.1% of the performance of using robot demonstrations.
arXiv Detail & Related papers (2025-04-18T17:12:00Z) - Single-Shot Learning of Stable Dynamical Systems for Long-Horizon Manipulation Tasks [48.54757719504994]
This paper focuses on improving task success rates while reducing the amount of training data needed.
Our approach introduces a novel method that segments long-horizon demonstrations into discrete steps defined by waypoints and subgoals.
We validate our approach through both simulation and real-world experiments, demonstrating effective transfer from simulation to physical robotic platforms.
arXiv Detail & Related papers (2024-10-01T19:49:56Z) - One-Shot Imitation under Mismatched Execution [7.060120660671016]
Human demonstrations are a powerful way to program robots to do long-horizon manipulation tasks. However, translating these demonstrations into robot-executable actions presents significant challenges due to execution mismatches in movement styles and physical capabilities. We propose RHyME, a novel framework that automatically pairs human and robot trajectories using sequence-level optimal transport cost functions.
arXiv Detail & Related papers (2024-09-10T16:11:57Z) - Semi-Supervised Active Learning for Semantic Segmentation in Unknown Environments Using Informative Path Planning [27.460481202195012]
Self-supervised and fully supervised active learning methods have emerged to improve a robot's vision.
We propose a planning method for semi-supervised active learning of semantic segmentation.
We leverage an adaptive map-based planner guided towards the frontiers of unexplored space with high model uncertainty.
arXiv Detail & Related papers (2023-12-07T16:16:47Z) - Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insights are to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z) - Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning [54.636562516974884]
In imitation and reinforcement learning, the cost of human supervision limits the amount of data that robots can be trained on.
In this work, we propose MEDAL++, a novel design for self-improving robotic systems.
The robot autonomously practices the task by learning to both do and undo the task, simultaneously inferring the reward function from the demonstrations.
arXiv Detail & Related papers (2023-03-02T18:51:38Z) - A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z) - Scalable Multi-Task Imitation Learning with Autonomous Improvement [159.9406205002599]
We build an imitation learning system that can continuously improve through autonomous data collection.
We leverage the robot's own trials as demonstrations for tasks other than the one that the robot actually attempted.
In contrast to prior imitation learning approaches, our method can autonomously collect data with sparse supervision for continuous improvement.
arXiv Detail & Related papers (2020-02-25T18:56:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.