Domain Randomization for Robust, Affordable and Effective Closed-loop
Control of Soft Robots
- URL: http://arxiv.org/abs/2303.04136v2
- Date: Thu, 25 Jan 2024 10:31:28 GMT
- Title: Domain Randomization for Robust, Affordable and Effective Closed-loop
Control of Soft Robots
- Authors: Gabriele Tiboni, Andrea Protopapa, Tatiana Tommasi, Giuseppe Averta
- Abstract summary: Soft robots are gaining popularity thanks to their intrinsic safety to contacts and adaptability.
We show how Domain Randomization (DR) can solve this problem by enhancing RL policies for soft robots.
We introduce a novel algorithmic extension to previous adaptive domain randomization methods for the automatic inference of dynamics parameters for deformable objects.
- Score: 10.977130974626668
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Soft robots are gaining popularity thanks to their intrinsic safety to
contacts and adaptability. However, the potentially infinite number of Degrees
of Freedom makes their modeling a daunting task, and in many cases only an
approximated description is available. This challenge makes reinforcement
learning (RL) based approaches inefficient when deployed on a realistic
scenario, due to the large domain gap between models and the real platform. In
this work, we demonstrate, for the first time, how Domain Randomization (DR)
can solve this problem by enhancing RL policies for soft robots with: i)
robustness w.r.t. unknown dynamics parameters; ii) reduced training times by
exploiting drastically simpler dynamic models for learning; iii) better
environment exploration, which can lead to exploitation of environmental
constraints for optimal performance. Moreover, we introduce a novel algorithmic
extension to previous adaptive domain randomization methods for the automatic
inference of dynamics parameters for deformable objects. We provide an
extensive evaluation in simulation on four different tasks and two soft robot
designs, opening interesting perspectives for future research on Reinforcement
Learning for closed-loop soft robot control.
Related papers
- Active Exploration in Bayesian Model-based Reinforcement Learning for Robot Manipulation [8.940998315746684]
We propose a model-based reinforcement learning (RL) approach for robotic arm end-tasks.
We employ Bayesian neural network models to represent, in a probabilistic way, both the belief and information encoded in the dynamic model during exploration.
Our experiments show the advantages of our Bayesian model-based RL approach, with similar quality in the results than relevant alternatives.
arXiv Detail & Related papers (2024-04-02T11:44:37Z) - Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for
Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insights are to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z) - DiAReL: Reinforcement Learning with Disturbance Awareness for Robust
Sim2Real Policy Transfer in Robot Control [0.0]
Delayed Markov decision processes fulfill the Markov property by augmenting the state space of agents with a finite time window of recently committed actions.
We introduce a disturbance-augmented Markov decision process in delayed settings as a novel representation to incorporate disturbance estimation in training on-policy reinforcement learning algorithms.
arXiv Detail & Related papers (2023-06-15T10:11:38Z) - Obstacle Avoidance for Robotic Manipulator in Joint Space via Improved
Proximal Policy Optimization [6.067589886362815]
In this paper, we train a deep neural network via an improved Proximal Policy Optimization (PPO) algorithm to map from task space to joint space for a 6-DoF manipulator.
Since training such a task in real-robot is time-consuming and strenuous, we develop a simulation environment to train the model.
Experimental results showed that using our method, the robot was capable of tracking a single target or reaching multiple targets in unstructured environments.
arXiv Detail & Related papers (2022-10-03T10:21:57Z) - Gradient-Based Trajectory Optimization With Learned Dynamics [80.41791191022139]
We use machine learning techniques to learn a differentiable dynamics model of the system from data.
We show that a neural network can model highly nonlinear behaviors accurately for large time horizons.
In our hardware experiments, we demonstrate that our learned model can represent complex dynamics for both the Spot and Radio-controlled (RC) car.
arXiv Detail & Related papers (2022-04-09T22:07:34Z) - Learning to Walk Autonomously via Reset-Free Quality-Diversity [73.08073762433376]
Quality-Diversity algorithms can discover large and complex behavioural repertoires consisting of both diverse and high-performing skills.
Existing QD algorithms need large numbers of evaluations as well as episodic resets, which require manual human supervision and interventions.
This paper proposes Reset-Free Quality-Diversity optimization (RF-QD) as a step towards autonomous learning for robotics in open-ended environments.
arXiv Detail & Related papers (2022-04-07T14:07:51Z) - OSCAR: Data-Driven Operational Space Control for Adaptive and Robust
Robot Manipulation [50.59541802645156]
Operational Space Control (OSC) has been used as an effective task-space controller for manipulation.
We propose OSC for Adaptation and Robustness (OSCAR), a data-driven variant of OSC that compensates for modeling errors.
We evaluate our method on a variety of simulated manipulation problems, and find substantial improvements over an array of controller baselines.
arXiv Detail & Related papers (2021-10-02T01:21:38Z) - Model Predictive Actor-Critic: Accelerating Robot Skill Acquisition with
Deep Reinforcement Learning [42.525696463089794]
Model Predictive Actor-Critic (MoPAC) is a hybrid model-based/model-free method that combines model predictive rollouts with policy optimization as to mitigate model bias.
MoPAC guarantees optimal skill learning up to an approximation error and reduces necessary physical interaction with the environment.
arXiv Detail & Related papers (2021-03-25T13:50:24Z) - Bayesian Meta-Learning for Few-Shot Policy Adaptation Across Robotic
Platforms [60.59764170868101]
Reinforcement learning methods can achieve significant performance but require a large amount of training data collected on the same robotic platform.
We formulate it as a few-shot meta-learning problem where the goal is to find a model that captures the common structure shared across different robotic platforms.
We experimentally evaluate our framework on a simulated reaching and a real-robot picking task using 400 simulated robots.
arXiv Detail & Related papers (2021-03-05T14:16:20Z) - Neural Dynamic Policies for End-to-End Sensorimotor Learning [51.24542903398335]
The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces.
We propose Neural Dynamic Policies (NDPs) that make predictions in trajectory distribution space.
NDPs outperform the prior state-of-the-art in terms of either efficiency or performance across several robotic control tasks.
arXiv Detail & Related papers (2020-12-04T18:59:32Z) - Fast Online Adaptation in Robotics through Meta-Learning Embeddings of
Simulated Priors [3.4376560669160385]
In the real world, a robot might encounter any situation starting from motor failures to finding itself in a rocky terrain.
We show that FAMLE allows the robots to adapt to novel damages in significantly fewer time-steps than the baselines.
arXiv Detail & Related papers (2020-03-10T12:37:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.