Off-Policy Deep Reinforcement Learning Algorithms for Handling Various
Robotic Manipulator Tasks
- URL: http://arxiv.org/abs/2212.05572v1
- Date: Sun, 11 Dec 2022 18:25:24 GMT
- Title: Off-Policy Deep Reinforcement Learning Algorithms for Handling Various
Robotic Manipulator Tasks
- Authors: Altun Rzayev, Vahid Tavakol Aghaei
- Abstract summary: In this study, three reinforcement learning algorithms; DDPG, TD3 and SAC have been used to train Fetch robotic manipulator for four different tasks.
All of these algorithms are off-policy and able to achieve their desired target by optimizing both policy and value functions.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In order to avoid conventional controlling methods which created obstacles
due to the complexity of systems and intense demand on data density, developing
modern and more efficient control methods are required. In this way,
reinforcement learning off-policy and model-free algorithms help to avoid
working with complex models. In terms of speed and accuracy, they become
prominent methods because the algorithms use their past experience to learn the
optimal policies. In this study, three reinforcement learning algorithms; DDPG,
TD3 and SAC have been used to train Fetch robotic manipulator for four
different tasks in MuJoCo simulation environment. All of these algorithms are
off-policy and able to achieve their desired target by optimizing both policy
and value functions. In the current study, the efficiency and the speed of
these three algorithms are analyzed in a controlled environment.
Related papers
- Edge Delayed Deep Deterministic Policy Gradient: efficient continuous control for edge scenarios [5.446048322940114]
We introduce a novel Reinforcement Learning algorithm tailored for edge scenarios, called Edge Delayed Deep Deterministic Policy Gradient (EdgeD3)
In this work, we introduce a novel Reinforcement Learning algorithm tailored for edge scenarios, called Edge Delayed Deep Deterministic Policy Gradient (EdgeD3)
arXiv Detail & Related papers (2024-12-09T11:17:04Z) - Mission-driven Exploration for Accelerated Deep Reinforcement Learning
with Temporal Logic Task Specifications [11.812602599752294]
We consider robots with unknown dynamics operating in environments with unknown structure.
Our goal is to synthesize a control policy that maximizes the probability of satisfying an automaton-encoded task.
We propose a novel DRL algorithm, which has the capability to learn control policies at a notably faster rate compared to similar methods.
arXiv Detail & Related papers (2023-11-28T18:59:58Z) - Accelerated Policy Learning with Parallel Differentiable Simulation [59.665651562534755]
We present a differentiable simulator and a new policy learning algorithm (SHAC)
Our algorithm alleviates problems with local minima through a smooth critic function.
We show substantial improvements in sample efficiency and wall-clock time over state-of-the-art RL and differentiable simulation-based algorithms.
arXiv Detail & Related papers (2022-04-14T17:46:26Z) - Learning Robust Policy against Disturbance in Transition Dynamics via
State-Conservative Policy Optimization [63.75188254377202]
Deep reinforcement learning algorithms can perform poorly in real-world tasks due to discrepancy between source and target environments.
We propose a novel model-free actor-critic algorithm to learn robust policies without modeling the disturbance in advance.
Experiments in several robot control tasks demonstrate that SCPO learns robust policies against the disturbance in transition dynamics.
arXiv Detail & Related papers (2021-12-20T13:13:05Z) - AWD3: Dynamic Reduction of the Estimation Bias [0.0]
We introduce a technique that eliminates the estimation bias in off-policy continuous control algorithms using the experience replay mechanism.
We show through continuous control environments of OpenAI gym that our algorithm matches or outperforms the state-of-the-art off-policy policy gradient learning algorithms.
arXiv Detail & Related papers (2021-11-12T15:46:19Z) - Learning Sampling Policy for Faster Derivative Free Optimization [100.27518340593284]
We propose a new reinforcement learning based ZO algorithm (ZO-RL) with learning the sampling policy for generating the perturbations in ZO optimization instead of using random sampling.
Our results show that our ZO-RL algorithm can effectively reduce the variances of ZO gradient by learning a sampling policy, and converge faster than existing ZO algorithms in different scenarios.
arXiv Detail & Related papers (2021-04-09T14:50:59Z) - A Two-stage Framework and Reinforcement Learning-based Optimization
Algorithms for Complex Scheduling Problems [54.61091936472494]
We develop a two-stage framework, in which reinforcement learning (RL) and traditional operations research (OR) algorithms are combined together.
The scheduling problem is solved in two stages, including a finite Markov decision process (MDP) and a mixed-integer programming process, respectively.
Results show that the proposed algorithms could stably and efficiently obtain satisfactory scheduling schemes for agile Earth observation satellite scheduling problems.
arXiv Detail & Related papers (2021-03-10T03:16:12Z) - A review of motion planning algorithms for intelligent robotics [0.8594140167290099]
We investigate and analyze principles of typical motion planning algorithms.
Traditional planning algorithms include graph search algorithms, sampling-based algorithms, and interpolating curve algorithms.
Supervised learning algorithms include MSVM, LSTM, MCTS and CNN.
Policy gradient algorithms include policy gradient method, actor-critic algorithm, A3C, A2C, DPG, DDPG, TRPO and PPO.
arXiv Detail & Related papers (2021-02-04T02:24:04Z) - Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z) - Reinforcement Learning with Fast Stabilization in Linear Dynamical
Systems [91.43582419264763]
We study model-based reinforcement learning (RL) in unknown stabilizable linear dynamical systems.
We propose an algorithm that certifies fast stabilization of the underlying system by effectively exploring the environment.
We show that the proposed algorithm attains $tildemathcalO(sqrtT)$ regret after $T$ time steps of agent-environment interaction.
arXiv Detail & Related papers (2020-07-23T23:06:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.