Dependability Analysis of Deep Reinforcement Learning based Robotics and
Autonomous Systems
- URL: http://arxiv.org/abs/2109.06523v1
- Date: Tue, 14 Sep 2021 08:42:29 GMT
- Title: Dependability Analysis of Deep Reinforcement Learning based Robotics and
Autonomous Systems
- Authors: Yi Dong, Xingyu Zhao, Xiaowei Huang
- Abstract summary: The black-box nature of Deep Reinforcement Learning (DRL) and the uncertain deployment environments of robotics pose new challenges to its dependability.
In this paper, we define a set of dependability properties in temporal logic and construct a Discrete-Time Markov Chain (DTMC) to model the dynamics of risks/failures of a DRL-driven RAS.
Our experimental results show that the proposed method is effective as a holistic assessment framework, while uncovering conflicts between the properties that may require trade-offs during training.
- Score: 10.499662874457998
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While Deep Reinforcement Learning (DRL) provides transformational
capabilities for the control of Robotics and Autonomous Systems (RAS), the
black-box nature of DRL and the uncertain deployment environments of RAS pose
new challenges to its dependability. Although many existing works impose
constraints on the DRL policy to ensure successful completion of the mission,
they fall short of assessing the DRL-driven RAS holistically across all
dependability properties. In this paper, we formally define a set of
dependability properties in temporal logic and construct a Discrete-Time
Markov Chain (DTMC) to model the dynamics of risks/failures of a DRL-driven
RAS interacting with its stochastic environment. We then perform probabilistic
model checking on the designed DTMC to verify those properties. Our
experimental results show that the proposed method is effective as a holistic
assessment framework, while uncovering conflicts between the properties that
may require trade-offs during training. Moreover, we find that standard DRL
training does not improve the dependability properties, which therefore
require bespoke optimisation objectives. Finally, our method offers a novel
dependability analysis of the Sim-to-Real challenge of DRL.
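To make the verification step concrete: the paper expresses dependability properties in temporal logic and checks them against a DTMC, a task normally delegated to a probabilistic model checker such as PRISM or Storm. Below is a minimal, self-contained sketch of the core computation behind one such property, the reachability query P=? [ F "fail" ]. The state space, transition probabilities, and property are illustrative assumptions, not the paper's actual model.

```python
# Illustrative sketch: checking the PCTL-style reachability property
# P=? [ F "fail" ] on a toy DTMC by fixed-point iteration. The model below
# is an assumption for illustration, not the DTMC from the paper.

import numpy as np

states = ["healthy", "degraded", "fail", "done"]   # fail/done are absorbing
P = np.array([
    # healthy  degraded  fail   done
    [0.90,     0.07,     0.01,  0.02],   # healthy
    [0.30,     0.55,     0.10,  0.05],   # degraded
    [0.00,     0.00,     1.00,  0.00],   # fail (absorbing)
    [0.00,     0.00,     0.00,  1.00],   # done (absorbing)
])
assert np.allclose(P.sum(axis=1), 1.0), "rows must be probability distributions"

fail, done = states.index("fail"), states.index("done")

# Least fixed point of: x[fail] = 1, x[done] = 0, x[s] = sum_t P[s, t] * x[t].
x = np.zeros(len(states))
x[fail] = 1.0
for _ in range(10_000):
    x_new = P @ x
    x_new[fail], x_new[done] = 1.0, 0.0   # boundary conditions
    if np.max(np.abs(x_new - x)) < 1e-12:
        break
    x = x_new

for s in ("healthy", "degraded"):
    print(f"P(eventually fail | start in {s}) = {x[states.index(s)]:.4f}")
```

Bounded-time variants such as P=? [ F<=k "fail" ] fall out of the same scheme by stopping the fixed-point iteration after k steps.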
Related papers
- Distributionally Robust Constrained Reinforcement Learning under Strong Duality [37.76993170360821]
We study the problem of Distributionally Robust Constrained RL (DRC-RL), where the goal is to maximize the expected reward subject to environmental distribution shifts and constraints.
We develop an algorithmic framework based on strong duality that enables the first efficient and provable solution (a generic dual formulation of constrained RL is sketched after this list).
arXiv Detail & Related papers (2024-06-22T08:51:57Z)
- Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning [53.3760591018817]
We propose a new benchmarking environment for aquatic navigation that builds on recent advances in the integration between game engines and Deep Reinforcement Learning.
Specifically, we focus on PPO, one of the most widely adopted algorithms, and propose advanced training techniques (the standard PPO clipped objective is sketched after this list).
Our empirical evaluation shows that a well-designed combination of these ingredients can achieve promising results.
arXiv Detail & Related papers (2024-05-30T23:20:23Z)
- Critic-Guided Decision Transformer for Offline Reinforcement Learning [28.211835303617118]
Critic-Guided Decision Transformer (CGDT) combines the predictability of long-term returns from value-based methods with the trajectory-modelling capability of the Decision Transformer.
arXiv Detail & Related papers (2023-12-21T10:29:17Z)
- Understanding, Predicting and Better Resolving Q-Value Divergence in Offline-RL [86.0987896274354]
We first identify a fundamental pattern, self-excitation, as the primary cause of Q-value estimation divergence in offline RL.
We then propose a novel Self-Excite Eigenvalue Measure (SEEM) metric to track the evolving properties of the Q-network during training.
For the first time, our theory can reliably predict at an early stage whether training will diverge.
arXiv Detail & Related papers (2023-10-06T17:57:44Z)
- When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning [62.00672284480755]
This paper aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent.
Accurate models of expert task execution have uses in safety-sensitive domains such as clinical decision making and autonomous driving.
arXiv Detail & Related papers (2023-02-15T04:14:20Z)
- Online Policy Optimization for Robust MDP [17.995448897675068]
Reinforcement learning (RL) has exceeded human performance in many synthetic settings such as video games and Go.
In this work, we consider an online robust Markov decision process (MDP), in which the agent interacts with an unknown nominal system.
We propose a robust optimistic policy optimization algorithm that is provably efficient.
arXiv Detail & Related papers (2022-09-28T05:18:20Z)
- Robust Reinforcement Learning using Offline Data [23.260211453437055]
We propose a robust reinforcement learning algorithm called Robust Fitted Q-Iteration (RFQI), which uses only an offline dataset to learn the optimal robust policy (a plain fitted Q-iteration baseline is sketched after this list).
We prove that RFQI learns a near-optimal robust policy under standard assumptions.
arXiv Detail & Related papers (2022-08-10T03:47:45Z)
- Benchmarking Safe Deep Reinforcement Learning in Aquatic Navigation [78.17108227614928]
We propose a benchmark environment for Safe Reinforcement Learning focusing on aquatic navigation.
We consider both value-based and policy-gradient Deep Reinforcement Learning (DRL) algorithms.
We also propose a verification strategy that checks the behavior of the trained models over a set of desired properties.
arXiv Detail & Related papers (2021-12-16T16:53:56Z)
- Pessimistic Model Selection for Offline Deep Reinforcement Learning [56.282483586473816]
Deep Reinforcement Learning (DRL) has demonstrated great potential in solving sequential decision-making problems in many applications.
One main barrier is the over-fitting issue, which leads to poor generalizability of the policy learned by DRL.
We propose a pessimistic model selection (PMS) approach for offline DRL with a theoretical guarantee.
arXiv Detail & Related papers (2021-11-29T06:29:49Z)
- Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning [56.17667147101263]
In real-world tasks, reinforcement learning agents encounter situations that are not present during training time.
To ensure reliable performance, the RL agents need to exhibit robustness against worst-case situations.
We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem.
arXiv Detail & Related papers (2021-03-18T16:50:17Z)
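As background for the strong-duality entry above, here is the generic Lagrangian formulation of constrained RL. This is standard textbook material, sketched as an assumed setup; it is not the DRC-RL derivation itself.

```latex
% Generic constrained-RL setup (background; not the DRC-RL derivation).
% Primal: maximise expected reward subject to an expected-cost budget b.
\max_{\pi} \; \mathbb{E}[R(\pi)] \quad \text{s.t.} \quad \mathbb{E}[C(\pi)] \le b
% Lagrangian, with multiplier \lambda \ge 0:
\mathcal{L}(\pi, \lambda) = \mathbb{E}[R(\pi)] + \lambda \bigl( b - \mathbb{E}[C(\pi)] \bigr)
% Strong duality asserts that the primal and dual optima coincide:
\max_{\pi} \min_{\lambda \ge 0} \mathcal{L}(\pi, \lambda)
  = \min_{\lambda \ge 0} \max_{\pi} \mathcal{L}(\pi, \lambda)
```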
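For the aquatic-navigation benchmark entry above, which centres its training on PPO, here is a minimal sketch of PPO's clipped surrogate objective. This is the textbook loss, not that paper's specific training techniques; all numbers below are made up for illustration.

```python
# Minimal sketch of PPO's clipped surrogate loss (textbook form).
import numpy as np

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Clipped surrogate loss (to be minimised) over a batch of transitions."""
    ratio = np.exp(logp_new - logp_old)            # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # PPO takes the pessimistic (elementwise minimum) of the two surrogates.
    return -np.mean(np.minimum(unclipped, clipped))

# Toy usage with made-up action probabilities and advantages:
logp_old = np.log([0.30, 0.50, 0.20])
logp_new = np.log([0.35, 0.45, 0.20])
advantages = np.array([1.0, -0.5, 0.2])
print(ppo_clip_loss(logp_new, logp_old, advantages))
```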
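For the RFQI entry above, here is a sketch of the plain fitted Q-iteration baseline it builds on, in a tabular setting. RFQI's contribution is to replace the standard Bellman target with a worst-case (robust) target over an uncertainty set of transition models; that robust operator is not reproduced here. The environment sizes, dataset, and hyper-parameters are illustrative assumptions.

```python
# Minimal sketch of plain (non-robust) tabular fitted Q-iteration on an
# offline dataset. RFQI would swap the Bellman target below for a robust one.
import numpy as np

n_states, n_actions, gamma = 4, 2, 0.95
rng = np.random.default_rng(0)

# Offline dataset of (s, a, r, s') tuples from some behaviour policy.
dataset = [(rng.integers(n_states), rng.integers(n_actions),
            rng.random(), rng.integers(n_states)) for _ in range(500)]

Q = np.zeros((n_states, n_actions))
for _ in range(200):                                   # FQI sweeps
    targets = np.zeros_like(Q)
    counts = np.zeros_like(Q)
    for s, a, r, s_next in dataset:
        targets[s, a] += r + gamma * Q[s_next].max()   # standard Bellman target
        counts[s, a] += 1
    mask = counts > 0
    Q[mask] = targets[mask] / counts[mask]             # tabular least-squares fit

print("Greedy policy per state:", Q.argmax(axis=1))
```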