Zero-Shot Uncertainty-Aware Deployment of Simulation Trained Policies on
Real-World Robots
- URL: http://arxiv.org/abs/2112.05299v1
- Date: Fri, 10 Dec 2021 02:13:01 GMT
- Title: Zero-Shot Uncertainty-Aware Deployment of Simulation Trained Policies on
Real-World Robots
- Authors: Krishan Rana, Vibhavari Dasagi, Jesse Haviland, Ben Talbot, Michael
Milford and Niko Sünderhauf
- Abstract summary: Deep reinforcement learning (RL) agents tend to make errors when deployed in the real world due to mismatches between the training and execution environments.
We propose Bayesian Controller Fusion (BCF), a novel uncertainty-aware deployment strategy that combines the strengths of deep RL policies and traditional handcrafted controllers.
We show promising results on two real-world continuous control tasks, where BCF outperforms both the standalone policy and controller.
- Score: 17.710172337571617
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While deep reinforcement learning (RL) agents have demonstrated incredible
potential in attaining dexterous behaviours for robotics, they tend to make
errors when deployed in the real world due to mismatches between the training
and execution environments. In contrast, the classical robotics community has
developed a range of controllers that can safely operate across most states in
the real world given their explicit derivation. These controllers, however, lack
the dexterity required for complex tasks given limitations in analytical
modelling and approximations. In this paper, we propose Bayesian Controller
Fusion (BCF), a novel uncertainty-aware deployment strategy that combines the
strengths of deep RL policies and traditional handcrafted controllers. In this
framework, we can perform zero-shot sim-to-real transfer, where our
uncertainty-based formulation allows the robot to reliably act within
out-of-distribution
states by leveraging the handcrafted controller while gaining the dexterity of
the learned system otherwise. We show promising results on two real-world
continuous control tasks, where BCF outperforms both the standalone policy and
controller, surpassing what either can achieve independently. A supplementary
video demonstrating our system is provided at https://bit.ly/bcf_deploy.
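The abstract describes weighting the learned policy against the handcrafted controller according to the policy's uncertainty. Below is a minimal sketch of one way such uncertainty-weighted fusion can be realised, assuming both the policy and the controller output independent Gaussian action distributions and fusing them as a precision-weighted product; the function name and numbers are illustrative, not the authors' implementation.

```python
# Hypothetical sketch of uncertainty-weighted action fusion (not the authors' code).
import numpy as np

def fuse_gaussians(mu_pi, sigma_pi, mu_ctrl, sigma_ctrl):
    """Precision-weighted product of two Gaussian action distributions.

    When the RL policy is uncertain (large sigma_pi), the fused action
    leans toward the handcrafted controller, and vice versa.
    """
    prec_pi = 1.0 / np.square(sigma_pi)      # precision of the learned policy
    prec_ctrl = 1.0 / np.square(sigma_ctrl)  # precision of the control prior
    var_fused = 1.0 / (prec_pi + prec_ctrl)
    mu_fused = var_fused * (prec_pi * mu_pi + prec_ctrl * mu_ctrl)
    return mu_fused, np.sqrt(var_fused)

# Example: the policy is confident on the first action dimension, uncertain on the second.
mu, sigma = fuse_gaussians(
    mu_pi=np.array([0.8, 0.1]), sigma_pi=np.array([0.05, 1.0]),
    mu_ctrl=np.array([0.2, -0.4]), sigma_ctrl=np.array([0.3, 0.3]),
)
print(mu, sigma)  # first dim follows the policy, second dim follows the controller
```

In out-of-distribution states the policy's variance grows, so the fused mean collapses toward the controller's suggestion, which matches the behaviour the abstract attributes to BCF.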
Related papers
- Autonomous Vehicle Controllers From End-to-End Differentiable Simulation [60.05963742334746]
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers.
Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy.
We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
arXiv Detail & Related papers (2024-09-12T11:50:06Z)
- Robotic Control via Embodied Chain-of-Thought Reasoning [86.6680905262442]
A key limitation of learned robot control policies is their inability to generalize outside their training data.
Recent works on vision-language-action models (VLAs) have shown that the use of large, internet pre-trained vision-language models can substantially improve their robustness and generalization ability.
We introduce Embodied Chain-of-Thought Reasoning (ECoT) for VLAs, in which we train VLAs to perform multiple steps of reasoning about plans, sub-tasks, motions, and visually grounded features before predicting the robot action.
arXiv Detail & Related papers (2024-07-11T17:31:01Z)
- Towards Transferring Tactile-based Continuous Force Control Policies from Simulation to Robot [19.789369416528604]
Grasp force control aims to manipulate objects safely by limiting the amount of force exerted on the object.
Prior works have either hand-modeled their force controllers, employed model-based approaches, or have not shown sim-to-real transfer.
We propose a model-free deep reinforcement learning approach trained in simulation and then transferred to the robot without further fine-tuning.
arXiv Detail & Related papers (2023-11-13T11:29:06Z)
- Nonprehensile Planar Manipulation through Reinforcement Learning with Multimodal Categorical Exploration [8.343657309038285]
Reinforcement learning is a powerful framework for developing controllers for such nonprehensile manipulation tasks.
We propose a multimodal exploration approach through categorical distributions, which enables us to train planar pushing RL policies.
We show that the learned policies are robust to external disturbances and observation noise, and scale to tasks with multiple pushers.
arXiv Detail & Related papers (2023-08-04T16:55:00Z)
- Residual Physics Learning and System Identification for Sim-to-real Transfer of Policies on Buoyancy Assisted Legged Robots [14.760426243769308]
In this work, we demonstrate robust sim-to-real transfer of control policies on the BALLU robots via system identification.
Rather than relying on standard supervised learning formulations, we utilize deep reinforcement learning to train an external force policy.
We analyze the improved simulation fidelity by comparing the simulation trajectories against the real-world ones.
arXiv Detail & Related papers (2023-03-16T18:49:05Z)
- Learning Bipedal Walking for Humanoids with Current Feedback [5.429166905724048]
We present an approach for overcoming the sim2real gap issue for humanoid robots arising from inaccurate torque-tracking at the actuator level.
Our approach successfully trains a unified, end-to-end policy in simulation that can be deployed on a real HRP-5P humanoid robot to achieve bipedal locomotion.
arXiv Detail & Related papers (2023-03-07T08:16:46Z)
- DeXtreme: Transfer of Agile In-hand Manipulation from Simulation to Reality [64.51295032956118]
We train a policy that can perform robust dexterous manipulation on an anthropomorphic robot hand.
Our work reaffirms the possibilities of sim-to-real transfer for dexterous manipulation in diverse kinds of hardware and simulator setups.
arXiv Detail & Related papers (2022-10-25T01:51:36Z)
- Active Predicting Coding: Brain-Inspired Reinforcement Learning for Sparse Reward Robotic Control Problems [79.07468367923619]
We propose a backpropagation-free approach to robotic control through the neuro-cognitive computational framework of neural generative coding (NGC).
We design an agent built completely from powerful predictive coding/processing circuits that facilitate dynamic, online learning from sparse rewards.
We show that our proposed ActPC agent performs well in the face of sparse (extrinsic) reward signals and is competitive with or outperforms several powerful backprop-based RL approaches.
arXiv Detail & Related papers (2022-09-19T16:49:32Z)
- Nonprehensile Riemannian Motion Predictive Control [57.295751294224765]
We introduce a novel Real-to-Sim reward analysis technique to reliably imagine and predict the outcome of taking possible actions for a real robotic platform.
We produce a closed-loop controller to reactively push objects in a continuous action space.
We observe that RMPC is robust in cluttered as well as occluded environments and outperforms the baselines.
arXiv Detail & Related papers (2021-11-15T18:50:04Z)
- Monolithic vs. hybrid controller for multi-objective Sim-to-Real learning [58.32117053812925]
Simulation to real (Sim-to-Real) is an attractive approach to construct controllers for robotic tasks.
In this work, we compare two approaches in the multi-objective setting of a robot manipulator to reach a target while avoiding an obstacle.
Our findings show that the training of a hybrid controller is easier and obtains a better success-failure trade-off than a monolithic controller.
arXiv Detail & Related papers (2021-08-17T09:02:33Z)
- Bayesian Controller Fusion: Leveraging Control Priors in Deep Reinforcement Learning for Robotics [17.660913275007317]
We present a hybrid control strategy that combines the strengths of traditional hand-crafted controllers and model-free deep reinforcement learning (RL).
BCF thrives in the robotics domain, where reliable but suboptimal control priors exist for many tasks, but RL from scratch remains unsafe and data-inefficient.
We show BCF's applicability to the zero-shot sim-to-real setting and its ability to deal with out-of-distribution states in the real world.
arXiv Detail & Related papers (2021-07-21T00:43:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.