Obstacle Avoidance for Robotic Manipulator in Joint Space via Improved
Proximal Policy Optimization
- URL: http://arxiv.org/abs/2210.00803v1
- Date: Mon, 3 Oct 2022 10:21:57 GMT
- Title: Obstacle Avoidance for Robotic Manipulator in Joint Space via Improved
Proximal Policy Optimization
- Authors: Yongliang Wang and Hamidreza Kasaei
- Abstract summary: In this paper, we train a deep neural network via an improved Proximal Policy Optimization (PPO) algorithm to map from task space to joint space for a 6-DoF manipulator.
Since training such a task on a real robot is time-consuming and strenuous, we develop a simulation environment to train the model.
Experimental results showed that using our method, the robot was capable of tracking a single target or reaching multiple targets in unstructured environments.
- Score: 6.067589886362815
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Reaching tasks with random targets and obstacles can still be challenging
when the robotic arm is operating in unstructured environments. In contrast to
traditional model-based methods, model-free reinforcement learning methods do
not require complex inverse kinematics or dynamics equations to be calculated.
In this paper, we train a deep neural network via an improved Proximal Policy
Optimization (PPO) algorithm, which aims to map from task space to joint space
for a 6-DoF manipulator. In particular, we modify the original PPO and design
an effective representation for environmental inputs and outputs to train the
robot faster in a larger workspace. Firstly, a type of action ensemble is
adopted to improve output efficiency. Secondly, the policy is designed to
participate directly in value-function updates. Finally, the distance between obstacles and
links of the manipulator is calculated based on a geometry method as part of
the representation of states. Since training such a task on a real robot is
time-consuming and strenuous, we develop a simulation environment to train the
model. We choose Gazebo as our first simulation environment since it often
produces a smaller Sim-to-Real gap than other simulators. However, the training
process in Gazebo is slow. To
address this limitation, we propose a Sim-to-Sim method to reduce the training
time significantly. The trained model is finally used in a real-robot setup
without fine-tuning. Experimental results showed that using our method, the
robot was capable of tracking a single target or reaching multiple targets in
unstructured environments.
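The abstract mentions that the link–obstacle distance is computed "based on a geometry method" but does not spell the method out. A common choice, sketched below as an assumption rather than the paper's actual implementation, is to model each manipulator link as a line segment and each obstacle as a point, and feed the point-to-segment distance into the state representation; the function name and use of NumPy are illustrative.

```python
import numpy as np

def point_segment_distance(p, a, b):
    """Shortest distance from point p (obstacle) to segment [a, b] (link)."""
    ab = b - a
    denom = np.dot(ab, ab)
    if denom == 0.0:                      # degenerate link: endpoints coincide
        return np.linalg.norm(p - a)
    # Project p onto the infinite line through a and b, then clamp
    # the projection parameter so the closest point stays on the segment.
    t = np.clip(np.dot(p - a, ab) / denom, 0.0, 1.0)
    return np.linalg.norm(p - (a + t * ab))

# Example: obstacle beside a unit-length link along the x-axis
d = point_segment_distance(np.array([0.5, 0.3, 0.0]),
                           np.array([0.0, 0.0, 0.0]),
                           np.array([1.0, 0.0, 0.0]))
print(d)  # 0.3
```

For a 6-DoF arm, one such distance per link (taking the minimum over obstacles) gives a compact, fixed-size addition to the state vector without requiring a full collision-geometry model.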
Related papers
- Autonomous Vehicle Controllers From End-to-End Differentiable Simulation [60.05963742334746]
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers.
Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy.
We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
arXiv Detail & Related papers (2024-09-12T11:50:06Z)
- Learning to navigate efficiently and precisely in real environments [14.52507964172957]
Embodied AI literature focuses on end-to-end agents trained in simulators like Habitat or AI-Thor.
In this work we explore end-to-end training of agents in simulation in settings which minimize the sim2real gap.
arXiv Detail & Related papers (2024-01-25T17:50:05Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insight is to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- Learning Bipedal Walking for Humanoids with Current Feedback [5.429166905724048]
We present an approach for overcoming the sim2real gap issue for humanoid robots arising from inaccurate torque-tracking at the actuator level.
Our approach successfully trains a unified, end-to-end policy in simulation that can be deployed on a real HRP-5P humanoid robot to achieve bipedal locomotion.
arXiv Detail & Related papers (2023-03-07T08:16:46Z)
- Real-to-Sim: Predicting Residual Errors of Robotic Systems with Sparse Data using a Learning-based Unscented Kalman Filter [65.93205328894608]
We learn the residual errors between a dynamic and/or simulator model and the real robot.
We show that with the learned residual errors, we can further close the reality gap between dynamic models, simulations, and actual hardware.
arXiv Detail & Related papers (2022-09-07T15:15:12Z)
- DiffSkill: Skill Abstraction from Differentiable Physics for Deformable Object Manipulations with Tools [96.38972082580294]
DiffSkill is a novel framework that uses a differentiable physics simulator for skill abstraction to solve deformable object manipulation tasks.
In particular, we first obtain short-horizon skills using individual tools from a gradient-based simulator.
We then learn a neural skill abstractor from the demonstration trajectories which takes RGBD images as input.
arXiv Detail & Related papers (2022-03-31T17:59:38Z)
- Off Environment Evaluation Using Convex Risk Minimization [0.0]
We propose a convex risk minimization algorithm to estimate the model mismatch between the simulator and the target domain.
We show that this estimator can be used along with the simulator to evaluate the performance of an RL agent in the target domain.
arXiv Detail & Related papers (2021-12-21T21:31:54Z)
- SAGCI-System: Towards Sample-Efficient, Generalizable, Compositional, and Incremental Robot Learning [41.19148076789516]
We introduce a systematic learning framework called SAGCI-system towards achieving the above four requirements.
Our system first takes as input the raw point clouds gathered by the camera mounted on the robot's wrist and produces an initial model of the surrounding environment represented as a URDF.
The robot then utilizes interactive perception to interact with the environment and verify and modify the URDF online.
arXiv Detail & Related papers (2021-11-29T16:53:49Z)
- Nonprehensile Riemannian Motion Predictive Control [57.295751294224765]
We introduce a novel Real-to-Sim reward analysis technique to reliably imagine and predict the outcome of taking possible actions for a real robotic platform.
We produce a closed-loop controller to reactively push objects in a continuous action space.
We observe that RMPC is robust in cluttered as well as occluded environments and outperforms the baselines.
arXiv Detail & Related papers (2021-11-15T18:50:04Z)
- An advantage actor-critic algorithm for robotic motion planning in dense and dynamic scenarios [0.8594140167290099]
In this paper, we modify the existing advantage actor-critic algorithm and adapt it to complex motion planning.
It achieves a higher success rate in motion planning with less processing time for the robot to reach its goal.
arXiv Detail & Related papers (2021-02-05T12:30:23Z) - Reactive Long Horizon Task Execution via Visual Skill and Precondition
Models [59.76233967614774]
We describe an approach for sim-to-real training that can accomplish unseen robotic tasks using models learned in simulation to ground components of a simple task planner.
We show an increase in success rate from 91.6% to 98% in simulation and from 10% to 80% success rate in the real-world as compared with naive baselines.
arXiv Detail & Related papers (2020-11-17T15:24:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.