A Search-Based Testing Approach for Deep Reinforcement Learning Agents
- URL: http://arxiv.org/abs/2206.07813v4
- Date: Fri, 4 Aug 2023 19:38:40 GMT
- Title: A Search-Based Testing Approach for Deep Reinforcement Learning Agents
- Authors: Amirhossein Zolfagharian, Manel Abdellatif, Lionel Briand, Mojtaba
Bagherzadeh and Ramesh S
- Abstract summary: We propose a Search-based Testing Approach of Reinforcement Learning Agents (STARLA) to test the policy of a DRL agent.
We use machine learning models and a dedicated genetic algorithm to narrow the search towards faulty episodes.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Reinforcement Learning (DRL) algorithms have been increasingly employed
during the last decade to solve various decision-making problems such as
autonomous driving and robotics. However, these algorithms have faced great
challenges when deployed in safety-critical environments since they often
exhibit erroneous behaviors that can lead to potentially critical errors. One
way to assess the safety of DRL agents is to test them to detect possible
faults leading to critical failures during their execution. This raises the
question of how we can efficiently test DRL policies to ensure their
correctness and adherence to safety requirements. Most existing works on
testing DRL agents use adversarial attacks that perturb states or actions of
the agent. However, such attacks often lead to unrealistic states of the
environment. Their main goal is to test the robustness of DRL agents rather
than testing the compliance of agents' policies with respect to requirements.
Due to the huge state space of DRL environments, the high cost of test
execution, and the black-box nature of DRL algorithms, the exhaustive testing
of DRL agents is impossible. In this paper, we propose a Search-based Testing
Approach of Reinforcement Learning Agents (STARLA) to test the policy of a DRL
agent by effectively searching for failing executions of the agent within a
limited testing budget. We use machine learning models and a dedicated genetic
algorithm to narrow the search towards faulty episodes. We apply STARLA to
Deep Q-Learning agents, which are widely used as benchmarks, and show that it
significantly outperforms Random Testing by detecting more faults related to
the agent's policy. We also investigate how to extract rules that characterize
faulty episodes of the DRL agent using our search results. Such rules can be
used to understand the conditions under which the agent fails and thus assess
its deployment risks.
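To make the search concrete, below is a minimal, hypothetical sketch of the kind of genetic search the abstract describes: a population of candidate episodes is evolved under a fitness given by a learned estimate of fault probability. This is not the authors' implementation; the episode encoding, the surrogate fault_probability scorer, and all parameters are illustrative assumptions.

```python
# Sketch of a genetic search for likely-faulty episodes, assuming episodes
# are fixed-length action sequences and a learned model scores fault risk.
import random

POP_SIZE, GENERATIONS, MUTATION_RATE = 50, 30, 0.1
EPISODE_LEN, N_ACTIONS = 20, 4

def fault_probability(episode):
    """Stand-in for a trained classifier estimating how likely this
    action sequence is to drive the agent into a faulty state."""
    # Toy surrogate: pretend action 3 near the end of the episode is risky.
    return sum(a == 3 for a in episode[-5:]) / 5.0

def crossover(p1, p2):
    """Single-point crossover on two parent action sequences."""
    cut = random.randrange(1, EPISODE_LEN)
    return p1[:cut] + p2[cut:]

def mutate(episode):
    """Resample each action with a small probability."""
    return [random.randrange(N_ACTIONS) if random.random() < MUTATION_RATE else a
            for a in episode]

def search_faulty_episodes():
    population = [[random.randrange(N_ACTIONS) for _ in range(EPISODE_LEN)]
                  for _ in range(POP_SIZE)]
    for _ in range(GENERATIONS):
        # Rank by predicted fault probability; keep the fitter half as parents.
        population.sort(key=fault_probability, reverse=True)
        parents = population[: POP_SIZE // 2]
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(POP_SIZE - len(parents))]
        population = parents + children
    # Return the most suspicious candidates for confirmation by execution.
    return population[:5]

if __name__ == "__main__":
    for ep in search_faulty_episodes():
        print(round(fault_probability(ep), 2), ep)
```

In the setting the abstract describes, the top-ranked candidates would then be executed against the actual agent and environment to confirm genuine faults, and the confirmed faulty episodes could feed the rule-extraction step used to characterize the conditions under which the agent fails.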
Related papers
- GUARD: A Safe Reinforcement Learning Benchmark (2023-05-23)
GUARD (Generalized Unified SAfe Reinforcement Learning Development Benchmark) is a generalized benchmark with a wide variety of RL agents, tasks, and safety constraint specifications.
We present a comparison of state-of-the-art safe RL algorithms in various task settings using GUARD and establish baselines that future work can build on.
- Testing of Deep Reinforcement Learning Agents with Surrogate Models (2023-05-22)
Deep Reinforcement Learning (DRL) has received a lot of attention from the research community in recent years.
In this paper, we propose a search-based approach to test DRL agents.
- Train Hard, Fight Easy: Robust Meta Reinforcement Learning (2023-01-26)
A major challenge of reinforcement learning (RL) in real-world applications is the variation between environments, tasks or clients.
Standard meta reinforcement learning (MRL) methods optimize the average return over tasks, but often suffer from poor results in tasks of high risk or difficulty.
In this work, we define a robust MRL objective with a controlled level of robustness.
The resulting data inefficiency is addressed via the novel Robust Meta RL algorithm (RoML).
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels (2022-09-24)
Reinforcement learning algorithms can succeed but require large amounts of interactions between the agent and the environment.
We propose a new method that uses unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
- A Comparison of Reinforcement Learning Frameworks for Software Testing Tasks (2022-08-25)
Deep Reinforcement Learning (DRL) has been successfully employed in complex testing tasks such as game testing, regression testing, and test case prioritization.
DRL frameworks offer well-maintained implemented state-of-the-art DRL algorithms to facilitate and speed up the development of DRL applications.
However, no study empirically evaluates the effectiveness and performance of the algorithms implemented in DRL frameworks.
- Search-Based Testing of Reinforcement Learning (2022-05-07)
We present a search-based testing framework for evaluating the safety and performance of deep RL agents.
For safety testing, our framework uses a search algorithm to find a reference trace that solves the RL task.
For robust performance testing, we create a diverse set of traces via fuzz testing (see the sketch after this list).
We apply our search-based testing approach to RL agents for Nintendo's Super Mario Bros.
- URLB: Unsupervised Reinforcement Learning Benchmark (2021-10-28)
We introduce the Unsupervised Reinforcement Learning Benchmark (URLB).
URLB consists of two phases: reward-free pre-training and downstream task adaptation with extrinsic rewards.
We provide twelve continuous control tasks from three domains for evaluation and open-source code for eight leading unsupervised RL methods.
- Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning (2021-03-18)
In real-world tasks, reinforcement learning agents encounter situations that are not present during training time.
To ensure reliable performance, the RL agents need to exhibit robustness against worst-case situations.
We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem.
- Robust Deep Reinforcement Learning through Adversarial Loss (2020-08-05)
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs.
We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
- Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations (2020-03-19)
A deep reinforcement learning (DRL) agent observes its states through observations, which may contain natural measurement errors or adversarial noise.
Since the observations deviate from the true states, they can mislead the agent into making suboptimal actions.
We show that naively applying existing techniques on improving robustness for classification tasks, like adversarial training, is ineffective for many RL tasks.
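The Search-Based Testing of Reinforcement Learning entry above mentions creating a diverse set of traces via fuzz testing. Below is a minimal, hypothetical sketch of that idea, perturbing a reference trace's actions to generate diverse test traces; the trace encoding, function name, and perturbation scheme are illustrative assumptions, not that paper's implementation.

```python
# Sketch of trace fuzzing: take a reference action sequence that solves the
# task and resample a few actions to produce diverse variants for testing.
import random

def fuzz_traces(reference, n_traces=100, n_flips=3, n_actions=4):
    """Yield variants of `reference` with `n_flips` actions resampled."""
    for _ in range(n_traces):
        trace = list(reference)
        for i in random.sample(range(len(trace)), k=n_flips):
            trace[i] = random.randrange(n_actions)  # perturb one action
        yield trace

# Example: fuzz a toy 10-step reference trace.
reference = [0, 1, 1, 2, 0, 3, 1, 0, 2, 1]
for t in fuzz_traces(reference, n_traces=3):
    print(t)
```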