Variable-Speed Teaching-Playback as Real-World Data Augmentation for Imitation Learning
- URL: http://arxiv.org/abs/2412.03252v1
- Date: Wed, 04 Dec 2024 11:51:50 GMT
- Title: Variable-Speed Teaching-Playback as Real-World Data Augmentation for Imitation Learning
- Authors: Nozomu Masuya, Hiroshi Sato, Koki Yamane, Takuya Kusume, Sho Sakaino, Toshiaki Tsuji
- Abstract summary: This paper proposes a novel method of data augmentation that is applicable to force control and preserves the advantages of real-world datasets.
An experiment was conducted on bilateral control-based imitation learning, a method of imitation learning equipped with position-force control.
Results showed a maximum 55% increase in success rate from a simple change in speed of real-world reactions.
- Score: 6.277546031193622
- Abstract: Because imitation learning relies on human demonstrations in hard-to-simulate settings, the inclusion of force control in this method has resulted in a shortage of training data, even with a simple change in speed. Although the field of data augmentation has addressed the lack of data, conventional methods of data augmentation for robot manipulation are limited to simulation-based methods or downsampling for position control. This paper proposes a novel method of data augmentation that is applicable to force control and preserves the advantages of real-world datasets. We applied teaching-playback at variable speeds as real-world data augmentation to increase both the quantity and quality of environmental reactions at variable speeds. An experiment was conducted on bilateral control-based imitation learning using a method of imitation learning equipped with position-force control. We evaluated the effect of real-world data augmentation on two tasks, pick-and-place and wiping, at variable speeds, each from two human demonstrations at fixed speed. The results showed a maximum 55% increase in success rate from a simple change in speed of real-world reactions and improved accuracy along the duration/frequency command by gathering environmental reactions at variable speeds.
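A minimal sketch of the time-rescaling step behind variable-speed teaching-playback, assuming a uniformly sampled demonstration and a 1 kHz playback controller (the function name, sampling rate, and speed factors are illustrative, not taken from the paper). The rescaled commands are only the playback input; the augmented data are the position/force responses the real robot records while replaying them.

```python
import numpy as np

def rescale_trajectory(t, q, speed_factor, control_period=1e-3):
    """Resample a recorded command trajectory for playback at a new speed.

    t: (N,) timestamps of the original demonstration [s]
    q: (N, D) recorded commands (e.g. joint positions)
    speed_factor: 2.0 plays back twice as fast, 0.5 at half speed
    control_period: sampling period of the playback controller [s]
    """
    # Stretch or compress the time axis; faster playback shortens the duration.
    t_scaled = (t - t[0]) / speed_factor
    # Uniform time grid at the controller's sampling period.
    t_new = np.arange(0.0, t_scaled[-1], control_period)
    # Linearly interpolate each command dimension onto the new grid.
    q_new = np.stack(
        [np.interp(t_new, t_scaled, q[:, d]) for d in range(q.shape[1])], axis=1
    )
    return t_new, q_new

# Example: playback commands at several speeds from one fixed-speed demonstration.
# The augmented dataset consists of the responses (positions and forces) recorded
# while the robot replays q_s on real hardware, not the interpolated commands.
t = np.linspace(0.0, 5.0, 5001)               # 5 s demonstration sampled at 1 kHz
q = np.stack([np.sin(t), np.cos(t)], axis=1)  # toy 2-DoF trajectory
for s in (0.75, 1.0, 1.25, 1.5):              # illustrative speed factors
    t_s, q_s = rescale_trajectory(t, q, s)
```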
Related papers
- Improving generalization of robot locomotion policies via Sharpness-Aware Reinforcement Learning [0.5399800035598186]
Differentiable simulators offer improved sample efficiency through exact gradients, but can be unstable in contact-rich environments.
This paper introduces a novel approach integrating sharpness-aware optimization into gradient-based reinforcement learning algorithms.
arXiv Detail & Related papers (2024-11-29T14:25:54Z)
- Self-Supervised Learning of Grasping Arbitrary Objects On-the-Move [8.445514342786579]
This study introduces three fully convolutional neural network (FCN) models to predict static grasp primitive, dynamic grasp primitive, and residual moving velocity error from visual inputs.
The proposed method achieved the highest grasping accuracy and pick-and-place efficiency.
arXiv Detail & Related papers (2024-11-15T02:59:16Z)
- Sample-efficient Imitative Multi-token Decision Transformer for Real-world Driving [18.34685506480288]
We propose Sample-efficient Imitative Multi-token Decision Transformer (SimDT)
SimDT introduces multi-token prediction, an online imitative learning pipeline, and prioritized experience replay into sequence-modelling reinforcement learning.
Results exceed popular imitation and reinforcement learning algorithms in both open-loop and closed-loop settings on the Waymax benchmark.
arXiv Detail & Related papers (2024-06-18T14:27:14Z)
- Leveraging the Power of Data Augmentation for Transformer-based Tracking [64.46371987827312]
We propose two data augmentation methods customized for tracking.
First, we optimize existing random cropping via a dynamic search radius mechanism and simulation for boundary samples.
Second, we propose a token-level feature mixing augmentation strategy, which makes the model more robust to challenges such as background interference.
arXiv Detail & Related papers (2023-09-15T09:18:54Z)
- Blur Interpolation Transformer for Real-World Motion from Blur [52.10523711510876]
We propose a blur interpolation transformer (BiT) to unravel the underlying temporal correlation encoded in blur.
Based on multi-scale residual Swin transformer blocks, we introduce dual-end temporal supervision and temporally symmetric ensembling strategies.
In addition, we design a hybrid camera system to collect the first real-world dataset of one-to-many blur-sharp video pairs.
arXiv Detail & Related papers (2022-11-21T13:10:10Z)
- Output Feedback Tube MPC-Guided Data Augmentation for Robust, Efficient Sensorimotor Policy Learning [49.05174527668836]
Imitation learning (IL) can generate computationally efficient sensorimotor policies from demonstrations provided by computationally expensive model-based sensing and control algorithms.
In this work, we combine IL with an output feedback robust tube model predictive controller to co-generate demonstrations and a data augmentation strategy to efficiently learn neural network-based sensorimotor policies.
We numerically demonstrate that our method can learn a robust visuomotor policy from a single demonstration, a two-order-of-magnitude improvement in demonstration efficiency compared to existing IL methods.
arXiv Detail & Related papers (2022-10-18T19:59:17Z)
- Perceiving the World: Question-guided Reinforcement Learning for Text-based Games [64.11746320061965]
This paper introduces world-perceiving modules, which automatically decompose tasks and prune actions by answering questions about the environment.
We then propose a two-phase training framework to decouple language learning from reinforcement learning, which further improves the sample efficiency.
arXiv Detail & Related papers (2022-03-20T04:23:57Z)
- A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z)
- Adaptive Gradient Method with Resilience and Momentum [120.83046824742455]
We propose an Adaptive Gradient Method with Resilience and Momentum (AdaRem)
AdaRem adjusts the parameter-wise learning rate according to whether the direction of a parameter's past changes is aligned with the direction of the current gradient; a loose sketch of this idea appears after the list.
Our method outperforms previous adaptive learning-rate-based algorithms in terms of training speed and test error.
arXiv Detail & Related papers (2020-10-21T14:49:00Z)
- Simulation-based reinforcement learning for real-world autonomous driving [9.773015744446067]
We use reinforcement learning in simulation to obtain a driving system controlling a full-size real-world vehicle.
The driving policy takes RGB images from a single camera and their semantic segmentation as input.
We use mostly synthetic data, with labelled real-world data appearing only in the training of the segmentation network.
arXiv Detail & Related papers (2019-11-29T00:08:58Z)
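As referenced in the AdaRem entry above, here is a loose, illustrative sketch of an alignment-scaled update in the spirit of that description, not the paper's exact algorithm: the per-parameter learning rate grows when an exponential moving average of past parameter changes points the same way as the upcoming descent direction and shrinks otherwise. Comparing past changes with the descent direction is one reading of that summary, and the function name, EMA coefficient, and scaling strength are invented for illustration.

```python
import numpy as np

def alignment_scaled_step(w, grad, past_change, base_lr=0.01, beta=0.9, k=0.5):
    """One illustrative SGD step with alignment-scaled per-parameter learning rates.

    w, grad:      parameter vector and its current gradient
    past_change:  exponential moving average of previous parameter changes
    beta, k:      EMA coefficient and scaling strength (illustrative values)
    """
    # +1 where past changes and the descent direction (-grad) agree, -1 otherwise.
    align = np.sign(past_change * -grad)
    # Raise the learning rate for aligned parameters, lower it for misaligned ones.
    lr = base_lr * (1.0 + k * align)
    step = -lr * grad
    w_new = w + step
    past_change_new = beta * past_change + (1.0 - beta) * step
    return w_new, past_change_new
```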
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.