Safe Reinforcement Learning in a Simulated Robotic Arm
- URL: http://arxiv.org/abs/2312.09468v2
- Date: Wed, 28 Feb 2024 21:04:12 GMT
- Title: Safe Reinforcement Learning in a Simulated Robotic Arm
- Authors: Luka Kovač and Igor Farkaš
- Abstract summary: Reinforcement learning (RL) agents need to explore their environments in order to learn optimal policies.
In this paper, we extend the applicability of safe RL algorithms by creating a customized environment with the Panda robotic arm.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning (RL) agents need to explore their environments in
order to learn optimal policies. In many environments and tasks, safety is of
critical importance. The widespread use of simulators offers a number of
advantages, including safe exploration, which will be indispensable in cases
where RL systems need to be trained directly in the physical environment (e.g.
in human-robot interaction). The popular Safety Gym library offers three mobile
agent types that can learn goal-directed tasks while considering various safety
constraints. In this paper, we extend the applicability of safe RL algorithms
by creating a customized environment with the Panda robotic arm in which Safety
Gym algorithms can be tested. We performed pilot experiments with the popular
PPO algorithm, comparing the baseline with the constrained version, and show
that the constrained version is able to learn an equally good policy while
better complying with safety constraints, at the cost of longer training time,
as expected.
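
To make the setup concrete, below is a minimal sketch of how a Safety-Gym-style cost signal could be exposed for a simulated Panda arm task. It is an illustration only, not the authors' environment: the third-party panda-gym package, the PandaReach-v3 task ID, the workspace-box constraint, and the assumption that the end-effector position occupies the first three entries of the observation are all assumptions made for this example. A constrained algorithm (e.g. a Lagrangian variant of PPO) would then consume info["cost"] in addition to the reward.

```python
# Hypothetical sketch (not the authors' code): a Gymnasium wrapper that exposes a
# Safety-Gym-style per-step cost for an existing Panda arm task. The constraint
# used here (penalizing end-effector positions outside a safe workspace box) is
# an illustrative assumption, as is the underlying "PandaReach-v3" task from the
# third-party panda-gym package.
import gymnasium as gym
import numpy as np


class SafetyCostWrapper(gym.Wrapper):
    """Adds info["cost"] = 1.0 whenever the end effector leaves a safe box."""

    def __init__(self, env, low=(-0.3, -0.3, 0.0), high=(0.3, 0.3, 0.4)):
        super().__init__(env)
        self.low = np.asarray(low)
        self.high = np.asarray(high)

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        # Assumption: the dict observation carries the end-effector position
        # in the first three entries of obs["observation"].
        ee_pos = np.asarray(obs["observation"][:3])
        unsafe = np.any(ee_pos < self.low) or np.any(ee_pos > self.high)
        info["cost"] = 1.0 if unsafe else 0.0
        return obs, reward, terminated, truncated, info


if __name__ == "__main__":
    import panda_gym  # noqa: F401  (registers the Panda tasks; assumed installed)

    env = SafetyCostWrapper(gym.make("PandaReach-v3"))
    obs, info = env.reset(seed=0)
    obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
    print(reward, info["cost"])
```

The design point is that safety is reported as a separate per-step cost rather than folded into the reward, which is what allows the constrained and unconstrained PPO variants to be compared on the same task.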
Related papers
- Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark [12.660770759420286]
We present an environment suite called Safety-Gymnasium, which encompasses safety-critical tasks in both single and multi-agent scenarios.
We offer a library of algorithms named Safe Policy Optimization (SafePO), comprising 16 state-of-the-art SafeRL algorithms.
arXiv Detail & Related papers (2023-10-19T08:19:28Z)
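
As a point of reference for the Safety-Gymnasium entry above, here is a minimal usage sketch. It assumes the safety-gymnasium package and the SafetyPointGoal1-v0 task ID; the key difference from plain Gymnasium is that the per-step safety cost is returned alongside the reward.

```python
# Minimal usage sketch for Safety-Gymnasium (assumes the safety-gymnasium
# package and the "SafetyPointGoal1-v0" task ID are available).
import safety_gymnasium

env = safety_gymnasium.make("SafetyPointGoal1-v0")
obs, info = env.reset(seed=0)
for _ in range(10):
    action = env.action_space.sample()
    # The per-step safety cost is returned as a separate element of the tuple.
    obs, reward, cost, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```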
- A Multiplicative Value Function for Safe and Efficient Reinforcement Learning [131.96501469927733]
We propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic.
The safety critic predicts the probability of constraint violation and discounts the reward critic that only estimates constraint-free returns.
We evaluate our method in four safety-focused environments, including classical RL benchmarks augmented with safety constraints and robot navigation tasks with images and raw Lidar scans as observations.
arXiv Detail & Related papers (2023-03-07T18:29:15Z)
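
The multiplicative value function summarized above can be illustrated with a small sketch: a safety critic estimates the probability of remaining constraint-free and scales a reward critic that estimates constraint-free returns. The network sizes and layers below are assumptions for illustration, not the paper's implementation.

```python
# Illustrative sketch (not the paper's code): combining a safety critic and a
# reward critic multiplicatively, as described in the summary above.
import torch.nn as nn


class MultiplicativeValue(nn.Module):
    def __init__(self, obs_dim, hidden=64):
        super().__init__()
        # Reward critic: estimates the constraint-free return.
        self.reward_critic = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(), nn.Linear(hidden, 1)
        )
        # Safety critic: predicts the probability of staying constraint-free.
        self.safety_critic = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(), nn.Linear(hidden, 1), nn.Sigmoid()
        )

    def forward(self, obs):
        v_reward = self.reward_critic(obs)
        p_safe = self.safety_critic(obs)
        # The safety probability discounts the constraint-free return estimate.
        return p_safe * v_reward
```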
- Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Provable Safe Reinforcement Learning with Binary Feedback [62.257383728544006]
We consider the problem of provable safe RL when given access to an offline oracle providing binary feedback on the safety of state-action pairs.
We provide a novel meta-algorithm, SABRE, which can be applied to any MDP setting given access to a black-box PAC RL algorithm for that setting.
arXiv Detail & Related papers (2022-10-26T05:37:51Z)
- Safe Reinforcement Learning Using Black-Box Reachability Analysis [20.875010584486812]
Reinforcement learning (RL) is capable of sophisticated motion planning and control for robots in uncertain environments.
To justify widespread deployment, robots must respect safety constraints without sacrificing performance.
We propose a Black-box Reachability-based Safety Layer (BRSL) with three main components.
arXiv Detail & Related papers (2022-04-15T10:51:09Z)
- Sim-to-Lab-to-Real: Safe Reinforcement Learning with Shielding and Generalization Guarantees [7.6347172725540995]
Safety is a critical component of autonomous systems and remains a challenge for learning-based policies to be utilized in the real world.
We propose Sim-to-Lab-to-Real to bridge the reality gap with a probabilistically guaranteed safety-aware policy distribution.
arXiv Detail & Related papers (2022-01-20T18:41:01Z)
- Learning Barrier Certificates: Towards Safe Reinforcement Learning with Zero Training-time Violations [64.39401322671803]
This paper explores the possibility of safe RL algorithms with zero training-time safety violations.
We propose an algorithm, Co-trained Barrier Certificate for Safe RL (CRABS), which iteratively learns barrier certificates, dynamics models, and policies.
arXiv Detail & Related papers (2021-08-04T04:59:05Z)
- Learning to be Safe: Deep RL with a Safety Critic [72.00568333130391]
A natural first approach toward safe RL is to manually specify constraints on the policy's behavior.
We propose to learn how to be safe in one set of tasks and environments, and then use that learned intuition to constrain future behaviors.
arXiv Detail & Related papers (2020-10-27T20:53:20Z)