Residual Feedback Learning for Contact-Rich Manipulation Tasks with
Uncertainty
- URL: http://arxiv.org/abs/2106.04306v1
- Date: Tue, 8 Jun 2021 13:06:35 GMT
- Title: Residual Feedback Learning for Contact-Rich Manipulation Tasks with
Uncertainty
- Authors: Alireza Ranjbar, Ngo Anh Vien, Hanna Ziesche, Joschka Boedecker,
Gerhard Neumann
- Abstract summary: Residual policy learning (RPL) offers a formulation to improve existing controllers with reinforcement learning (RL).
We show superior performance of our approach on a contact-rich peg-insertion task under position and orientation uncertainty.
- Score: 22.276925045008788
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: While classic control theory offers state-of-the-art solutions in many
problem scenarios, it is often desired to improve beyond the structure of such
solutions and surpass their limitations. To this end, residual policy learning (RPL) offers a
formulation to improve existing controllers with reinforcement learning (RL) by
learning an additive "residual" to the output of a given controller. However,
the applicability of such an approach highly depends on the structure of the
controller. Often, internal feedback signals of the controller limit the ability of an
RL algorithm to adequately change the policy and, hence, learn the task. We
propose a new formulation that addresses these limitations by also modifying
the feedback signals to the controller with an RL policy and show superior
performance of our approach on a contact-rich peg-insertion task under position
and orientation uncertainty. In addition, we use a recent impedance control
architecture as control framework and show the difficulties of standard RPL.
Furthermore, we introduce an adaptive curriculum for the given task to
gradually increase the task difficulty in terms of position and orientation
uncertainty. A video showing the results can be found at
https://youtu.be/SAZm_Krze7U .
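To make the residual-feedback formulation concrete, the following is a minimal Python sketch of how such a policy could wrap an existing controller. The class name, the base_controller and rl_policy interfaces, and the assumption that the RL policy outputs both a feedback correction and an action residual are illustrative and not the authors' implementation.

import numpy as np

class ResidualFeedbackPolicy:
    """Wrap a base controller with a residual RL policy that also
    perturbs the controller's feedback signal (minimal sketch)."""

    def __init__(self, base_controller, rl_policy):
        # base_controller: callable mapping a feedback signal to a control
        # command, e.g. an impedance controller (assumed interface).
        self.base_controller = base_controller
        # rl_policy: callable returning (delta_feedback, delta_action) for a
        # given measurement; trained with any standard RL algorithm (assumed).
        self.rl_policy = rl_policy

    def act(self, measurement):
        measurement = np.asarray(measurement, dtype=float)

        # The RL policy outputs a correction to the feedback signal and a
        # residual on the control command.
        delta_feedback, delta_action = self.rl_policy(measurement)

        # Key difference to standard RPL: the controller sees a modified
        # feedback signal, so its internal feedback loop can be influenced.
        corrected_feedback = measurement + np.asarray(delta_feedback, dtype=float)

        # The base controller computes its command from the corrected feedback.
        u_ctrl = self.base_controller(corrected_feedback)

        # Standard residual term, as in plain residual policy learning.
        return u_ctrl + np.asarray(delta_action, dtype=float)

In standard RPL only the final residual term would be learned; additionally learning the feedback correction is what the paper proposes to overcome the limitation that the controller's internal feedback imposes on the policy.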
Related papers
- Growing Q-Networks: Solving Continuous Control Tasks with Adaptive Control Resolution [51.83951489847344]
In robotics applications, smooth control signals are commonly preferred to reduce system wear and improve energy efficiency.
In this work, we aim to bridge this performance gap by growing discrete action spaces from coarse to fine control resolution.
Our work indicates that an adaptive control resolution in combination with value decomposition yields simple critic-only algorithms with surprisingly strong performance on continuous control tasks.
arXiv Detail & Related papers (2024-04-05T17:58:37Z) - A Safe Reinforcement Learning Algorithm for Supervisory Control of Power
Plants [7.1771300511732585]
Model-free reinforcement learning (RL) has emerged as a promising solution for control tasks.
We propose a chance-constrained RL algorithm based on Proximal Policy Optimization for supervisory control.
Our approach achieves the smallest constraint-violation distance and violation rate in a load-follow maneuver for an advanced Nuclear Power Plant design.
arXiv Detail & Related papers (2024-01-23T17:52:49Z) - When does return-conditioned supervised learning work for offline
reinforcement learning? [51.899892382786526]
We study the capabilities and limitations of return-conditioned supervised learning (RCSL).
We find that RCSL returns the optimal policy under a set of assumptions stronger than those needed for the more traditional dynamic programming-based algorithms.
arXiv Detail & Related papers (2022-06-02T15:05:42Z) - Steady-State Error Compensation in Reference Tracking and Disturbance
Rejection Problems for Reinforcement Learning-Based Control [0.9023847175654602]
Reinforcement learning (RL) is a promising, upcoming topic in automatic control applications.
Initiative action state augmentation (IASA) for actor-critic-based RL controllers is introduced.
This augmentation does not require any expert knowledge, leaving the approach model free.
arXiv Detail & Related papers (2022-01-31T16:29:19Z) - Closing the Closed-Loop Distribution Shift in Safe Imitation Learning [80.05727171757454]
We treat safe optimization-based control strategies as experts in an imitation learning problem.
We train a learned policy that can be cheaply evaluated at run-time and that provably satisfies the same safety guarantees as the expert.
arXiv Detail & Related papers (2021-02-18T05:11:41Z) - Deep Reinforcement Learning with Embedded LQR Controllers [1.256413718364189]
We introduce a method that integrates LQR control into the action set, allowing generalization and avoiding fixing the computed control in the replay memory.
In all cases, we show that adding LQR control can improve performance, although the effect is more profound if it can be used to augment a discrete action set.
arXiv Detail & Related papers (2021-01-18T17:28:48Z) - Regularizing Action Policies for Smooth Control with Reinforcement
Learning [47.312768123967025]
Conditioning for Action Policy Smoothness (CAPS) is an effective yet intuitive regularization on action policies.
CAPS offers consistent improvement in the smoothness of the learned state-to-action mappings of neural network controllers.
Tested on a real system, improvements in controller smoothness on a quadrotor drone resulted in an almost 80% reduction in power consumption.
arXiv Detail & Related papers (2020-12-11T21:35:24Z) - DDPG++: Striving for Simplicity in Continuous-control Off-Policy
Reinforcement Learning [95.60782037764928]
We show that simple Deterministic Policy Gradient works remarkably well as long as the overestimation bias is controlled.
Second, we pinpoint training instabilities, typical of off-policy algorithms, to the greedy policy update step.
Third, we show that ideas from the propensity estimation literature can be used to importance-sample transitions from the replay buffer and update the policy to prevent deterioration of performance.
arXiv Detail & Related papers (2020-06-26T20:21:12Z) - Optimal PID and Antiwindup Control Design as a Reinforcement Learning
Problem [3.131740922192114]
We focus on the interpretability of DRL control methods.
In particular, we view linear fixed-structure controllers as shallow neural networks embedded in the actor-critic framework.
arXiv Detail & Related papers (2020-05-10T01:05:26Z) - Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot
Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained proximal policy optimization (CPPO).
We show that guided constrained RL offers faster convergence close to the desired optimum resulting in an optimal, yet physically feasible, robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.