Benchmarking Deep Reinforcement Learning Algorithms for Vision-based
Robotics
- URL: http://arxiv.org/abs/2201.04224v1
- Date: Tue, 11 Jan 2022 22:45:25 GMT
- Title: Benchmarking Deep Reinforcement Learning Algorithms for Vision-based
Robotics
- Authors: Swagat Kumar, Hayden Sampson, Ardhendu Behera
- Abstract summary: This paper presents a benchmarking study of some of the state-of-the-art reinforcement learning algorithms used for solving two vision-based robotics problems.
The performances of these algorithms are compared in two PyBullet simulation environments, KukaDiverseObjectEnv and RacecarZEDGymEnv.
- Score: 11.225021326001778
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a benchmarking study of some of the state-of-the-art
reinforcement learning algorithms used for solving two simulated vision-based
robotics problems. The algorithms considered in this study include soft
actor-critic (SAC), proximal policy optimization (PPO), interpolated policy
gradients (IPG), and their variants with Hindsight Experience Replay (HER). The
performances of these algorithms are compared in two PyBullet simulation
environments, KukaDiverseObjectEnv and RacecarZEDGymEnv. The state observations
in these environments are available in the form of RGB images and the action
space is continuous, making them difficult to solve. A number of strategies are
suggested to provide the intermediate hindsight goals required for implementing
the HER algorithm on these problems, which are essentially single-goal
environments. In addition, a number of feature extraction architectures are
proposed to incorporate spatial and temporal attention into the learning
process. Through rigorous simulation experiments, the improvements achieved
with these components are established. To the best of our knowledge, such a
benchmarking study is not available for these two vision-based robotics
problems, making it a novel contribution to the field.
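The hindsight-goal strategy mentioned in the abstract can be sketched as follows. This is a minimal illustration of HER-style relabeling with the common "future" strategy, not the paper's exact scheme; the transition fields, `reward_fn`, and `k` parameter are illustrative assumptions.

```python
import random

def relabel_with_hindsight(episode, reward_fn, k=4):
    """Relabel transitions in a single-goal episode with hindsight goals.

    episode: list of dicts with keys 'state', 'action', 'next_state', 'goal'.
    reward_fn(next_state, goal) -> float, recomputes the reward for a new goal.
    Uses the 'future' strategy: for each transition, sample up to k goals
    from states actually achieved later in the same episode.
    """
    relabeled = []
    for t, tr in enumerate(episode):
        relabeled.append(dict(tr))  # keep the original transition
        future = episode[t:]
        for _ in range(min(k, len(future))):
            # treat an achieved later state as if it had been the goal
            g = random.choice(future)['next_state']
            new_tr = dict(tr)
            new_tr['goal'] = g
            new_tr['reward'] = reward_fn(tr['next_state'], g)
            relabeled.append(new_tr)
    return relabeled
```

Because the relabeled transitions contain goals that were actually reached, some of them carry positive reward even when the original episode never reached the true goal, which is what makes HER useful in sparse-reward, single-goal settings.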
Related papers
- Real Evaluations Tractability using Continuous Goal-Directed Actions in
Smart City Applications [3.1158660854608824]
Continuous Goal-Directed Actions (CGDA) encode actions as changes in any feature that can be extracted from the environment.
Current strategies involve performing evaluations in a simulation and transferring the final joint trajectory to the actual robot.
Two different approaches to reducing the number of evaluations using Evolutionary Algorithms (EA) are proposed and compared.
arXiv Detail & Related papers (2024-02-01T15:38:21Z) - Graphical Object-Centric Actor-Critic [55.2480439325792]
We propose a novel object-centric reinforcement learning algorithm combining actor-critic and model-based approaches.
We use a transformer encoder to extract object representations and graph neural networks to approximate the dynamics of an environment.
Our algorithm performs better in a visually complex 3D robotic environment and a 2D environment with compositional structure than the state-of-the-art model-free actor-critic algorithm.
arXiv Detail & Related papers (2023-10-26T06:05:12Z) - Discovering General Reinforcement Learning Algorithms with Adversarial
Environment Design [54.39859618450935]
We show that it is possible to meta-learn update rules, with the hope of discovering algorithms that can perform well on a wide range of RL tasks.
Despite impressive initial results from algorithms such as Learned Policy Gradient (LPG), there remains a gap when these algorithms are applied to unseen environments.
In this work, we examine how characteristics of the meta-supervised-training distribution impact the performance of these algorithms.
arXiv Detail & Related papers (2023-10-04T12:52:56Z) - Contribution à l'Optimisation d'un Comportement Collectif pour un
Groupe de Robots Autonomes (Contribution to the Optimization of a Collective Behavior for a Group of Autonomous Robots) [0.0]
This thesis studies the domain of collective robotics, and more particularly the optimization problems of multirobot systems.
The first contribution is the use of the Butterfly Optimization Algorithm (BOA) to solve the Unknown Area Exploration problem.
The second contribution is the development of a new simulation framework for benchmarking dynamic incremental problems in robotics.
arXiv Detail & Related papers (2023-06-10T21:49:08Z) - A Survey on Deep Learning-Based Monocular Spacecraft Pose Estimation:
Current State, Limitations and Prospects [7.08026800833095]
Estimating the pose of an uncooperative spacecraft is an important computer vision problem for enabling vision-based systems in orbit.
Following the general trend in computer vision, more and more works have been focusing on leveraging Deep Learning (DL) methods to address this problem.
Despite promising research-stage results, major challenges preventing the use of such methods in real-life missions still stand in the way.
arXiv Detail & Related papers (2023-05-12T09:52:53Z) - Benchmarking Actor-Critic Deep Reinforcement Learning Algorithms for
Robotics Control with Action Constraints [9.293472255463454]
This study presents a benchmark for evaluating action-constrained reinforcement learning (RL) algorithms.
We evaluate existing algorithms and their novel variants across multiple robotics control environments.
arXiv Detail & Related papers (2023-04-18T05:45:09Z) - Representation Learning with Multi-Step Inverse Kinematics: An Efficient
and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z) - Lexicographic Multi-Objective Reinforcement Learning [65.90380946224869]
We present a family of both action-value and policy gradient algorithms that can be used to solve such problems.
We show how our algorithms can be used to impose safety constraints on the behaviour of an agent, and compare their performance in this context with that of other constrained reinforcement learning algorithms.
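The lexicographic idea above can be sketched as follows: actions are first filtered by the highest-priority objective (e.g. safety) within a tolerance, and remaining ties are broken by the next objective. The Q-value table and tolerance are illustrative assumptions, not the paper's algorithm.

```python
def lexicographic_argmax(q_values, tolerance=1e-6):
    """Pick an action by lexicographic preference over objectives.

    q_values: dict mapping action -> tuple of per-objective values,
              ordered from highest to lowest priority.
    At each priority level, keep only the actions within `tolerance` of
    the best value for that objective, then refine by the next objective.
    """
    candidates = list(q_values)
    n_objectives = len(next(iter(q_values.values())))
    for i in range(n_objectives):
        best = max(q_values[a][i] for a in candidates)
        candidates = [a for a in candidates
                      if q_values[a][i] >= best - tolerance]
        if len(candidates) == 1:
            break
    return candidates[0]
```

With `q_values = {'brake': (1.0, 5.0), 'steer': (1.0, 7.0), 'accelerate': (0.5, 9.0)}`, the first objective eliminates `accelerate` despite its high secondary value, and the second objective then selects `steer`; this is how a safety constraint can dominate reward maximization.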
arXiv Detail & Related papers (2022-12-28T10:22:36Z) - Solving the vehicle routing problem with deep reinforcement learning [0.0]
This paper focuses on the application of RL for the Vehicle Routing Problem (VRP), a famous problem that belongs to the class of NP-Hard problems.
In a second phase, the neural architecture behind the Actor and Critic was established, adopting an architecture based on convolutional neural networks.
Experiments performed on a wide range of instances show that the algorithm has good generalization capabilities and can reach good solutions in a short time.
arXiv Detail & Related papers (2022-07-30T12:34:26Z) - Composable Learning with Sparse Kernel Representations [110.19179439773578]
We present a reinforcement learning algorithm for learning sparse non-parametric controllers in a Reproducing Kernel Hilbert Space.
We improve the sample complexity of this approach by imposing structure on the state-action function through a normalized advantage function.
We demonstrate the performance of this algorithm on learning obstacle-avoidance policies in multiple simulations of a robot equipped with a laser scanner while navigating in a 2D environment.
arXiv Detail & Related papers (2021-03-26T13:58:23Z) - A User's Guide to Calibrating Robotics Simulators [54.85241102329546]
This paper proposes a set of benchmarks and a framework for the study of various algorithms aimed to transfer models and policies learnt in simulation to the real world.
We conduct experiments on a wide range of well known simulated environments to characterize and offer insights into the performance of different algorithms.
Our analysis can be useful for practitioners working in this area and can help make informed choices about the behavior and main properties of sim-to-real algorithms.
arXiv Detail & Related papers (2020-11-17T22:24:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.