Testing of Deep Reinforcement Learning Agents with Surrogate Models
- URL: http://arxiv.org/abs/2305.12751v2
- Date: Sat, 11 Nov 2023 15:10:56 GMT
- Title: Testing of Deep Reinforcement Learning Agents with Surrogate Models
- Authors: Matteo Biagiola, Paolo Tonella
- Abstract summary: Deep Reinforcement Learning (DRL) has received a lot of attention from the research community in recent years.
In this paper, we propose a search-based approach to test such agents.
- Score: 10.243488468625786
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Reinforcement Learning (DRL) has received a lot of attention from the
research community in recent years. As the technology moves away from game
playing to practical contexts, such as autonomous vehicles and robotics, it is
crucial to evaluate the quality of DRL agents. In this paper, we propose a
search-based approach to test such agents. Our approach, implemented in a tool
called Indago, trains a classifier on failure and non-failure environment
(i.e., pass) configurations resulting from the DRL training process. The
classifier is used at testing time as a surrogate model for the DRL agent
execution in the environment, predicting the extent to which a given
environment configuration induces a failure of the DRL agent under test. The
failure prediction acts as a fitness function, guiding the generation towards
failure environment configurations, while saving computation time by deferring
the execution of the DRL agent in the environment to those configurations that
are more likely to expose failures. Experimental results show that our
search-based approach finds 50% more failures of the DRL agent than
state-of-the-art techniques. Moreover, such failures are, on average, 78% more
diverse; similarly, the behaviors of the DRL agent induced by failure
configurations are 74% more diverse.
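Below is a minimal sketch of the surrogate-guided test generation idea described in the abstract, assuming numeric environment configurations and a scikit-learn random forest as the surrogate classifier. The helper names (train_surrogate, fitness, search) and the mutation-based search loop are illustrative assumptions, not the actual Indago implementation: a classifier trained on failure/pass configurations yields a failure-probability fitness that guides the search, and only the highest-ranked configurations would then be executed with the real DRL agent.

```python
# Hypothetical sketch, not the Indago API: surrogate classifier as fitness
# for a mutation-based search over environment configurations.
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def train_surrogate(configs: np.ndarray, failed: np.ndarray) -> RandomForestClassifier:
    """Fit a classifier on environment configurations collected during DRL
    training, labeled 1 (failure) or 0 (pass)."""
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(configs, failed)
    return clf


def fitness(clf: RandomForestClassifier, config: np.ndarray) -> float:
    """Predicted failure probability, used as the search fitness so the agent
    does not have to be executed for every candidate configuration."""
    return float(clf.predict_proba(config.reshape(1, -1))[0, 1])


def search(clf, seed_configs, n_iters=200, mutation_scale=0.1, rng=None):
    """Simple mutation-based search over numeric configurations; only the
    top-ranked candidates would later be run in the real environment to
    confirm actual failures of the agent under test."""
    rng = rng or np.random.default_rng(0)
    population = [np.asarray(c, dtype=float) for c in seed_configs]
    for _ in range(n_iters):
        parent = max(population, key=lambda c: fitness(clf, c))
        child = parent + rng.normal(0.0, mutation_scale, size=parent.shape)
        if fitness(clf, child) >= fitness(clf, parent):
            population.append(child)
    return sorted(population, key=lambda c: fitness(clf, c), reverse=True)
```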
Related papers
- muPRL: A Mutation Testing Pipeline for Deep Reinforcement Learning based on Real Faults [19.32186653723838]
We first describe a taxonomy of real RL faults obtained by repository mining.
Then, we present the mutation operators derived from such real faults and implemented in the tool muPRL.
We discuss the experimental results, showing that muPRL is effective at discriminating strong from weak test generators.
arXiv Detail & Related papers (2024-08-27T15:45:13Z) - SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning [85.21378553454672]
We develop a library containing a sample-efficient off-policy deep RL method, together with methods for computing rewards and resetting the environment.
We find that our implementation can achieve very efficient learning, acquiring policies for PCB board assembly, cable routing, and object relocation.
These policies achieve perfect or near-perfect success rates, extreme robustness even under perturbations, and exhibit emergent robustness recovery and correction behaviors.
arXiv Detail & Related papers (2024-01-29T10:01:10Z) - Can Agents Run Relay Race with Strangers? Generalization of RL to Out-of-Distribution Trajectories [88.08381083207449]
We show the prevalence of generalization failure on controllable states from stranger agents.
We propose a novel method called Self-Trajectory Augmentation (STA), which resets the environment to the agent's old states according to the Q function during training.
arXiv Detail & Related papers (2023-04-26T10:12:12Z) - Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interaction between the agent and the environment.
We propose a new method that uses unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z) - A Search-Based Testing Approach for Deep Reinforcement Learning Agents [1.1580916951856255]
We propose a Search-based Testing Approach of Reinforcement Learning Agents (STARLA) to test the policy of a DRL agent.
We use machine learning models and a dedicated genetic algorithm to narrow the search towards faulty episodes.
arXiv Detail & Related papers (2022-06-15T20:51:33Z) - Retrieval-Augmented Reinforcement Learning [63.32076191982944]
We train a network to map a dataset of past experiences to optimal behavior.
The retrieval process is trained to retrieve information from the dataset that may be useful in the current context.
We show that retrieval-augmented R2D2 learns significantly faster than the baseline R2D2 agent and achieves higher scores.
arXiv Detail & Related papers (2022-02-17T02:44:05Z) - Enhancing the Generalization Performance and Speed Up Training for DRL-based Mapless Navigation [18.13884934663477]
DRL agents performing well in training scenarios are found to perform poorly in some unseen real-world scenarios.
In this paper, we discuss why the DRL agent fails in such unseen scenarios and find that the representation of LiDAR readings is the key factor behind the agent's performance degradation.
We propose an easy but efficient input pre-processing (IP) approach to accelerate training and enhance the performance of the DRL agent in such scenarios.
arXiv Detail & Related papers (2021-03-22T09:36:51Z) - Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning [56.17667147101263]
In real-world tasks, reinforcement learning agents encounter situations that are not present during training time.
To ensure reliable performance, the RL agents need to exhibit robustness against worst-case situations.
We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem.
arXiv Detail & Related papers (2021-03-18T16:50:17Z) - Auto-Agent-Distiller: Towards Efficient Deep Reinforcement Learning Agents via Neural Architecture Search [15.3602645148428]
We propose an Auto-Agent-Distiller (A2D) framework to automatically search for the optimal DRL agents for various tasks.
We demonstrate that vanilla NAS can easily fail in searching for the optimal agents due to the high variance in DRL training stability.
We then develop a novel distillation mechanism to distill the knowledge from both the teacher agent's actor and critic to stabilize the searching process and improve the searched agents' optimality.
arXiv Detail & Related papers (2020-12-24T04:07:36Z) - Robust Deep Reinforcement Learning through Adversarial Loss [74.20501663956604]
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs.
We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
arXiv Detail & Related papers (2020-08-05T07:49:42Z)