Mutation Testing of Deep Reinforcement Learning Based on Real Faults
- URL: http://arxiv.org/abs/2301.05651v1
- Date: Fri, 13 Jan 2023 16:45:56 GMT
- Title: Mutation Testing of Deep Reinforcement Learning Based on Real Faults
- Authors: Florian Tambon, Vahid Majdinasab, Amin Nikanjam, Foutse Khomh,
Giuliano Antoniol
- Abstract summary: This paper builds on the existing approach of Mutation Testing (MT) to extend it to Reinforcement Learning (RL) systems.
We show that the design choice of the mutation killing definition can affect whether or not a mutation is killed as well as the generated test cases.
- Score: 11.584571002297217
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Testing Deep Learning (DL) systems is a complex task as they do not behave
like traditional systems would, notably because of their stochastic nature.
Nonetheless, being able to adapt existing testing techniques such as Mutation
Testing (MT) to DL settings would greatly improve their potential
verifiability. While some efforts have been made to extend MT to the Supervised
Learning (SL) paradigm, little work has gone into extending it to Reinforcement
Learning (RL) which is also an important component of the DL ecosystem but
behaves very differently from SL. This paper builds on the existing approach of
MT in order to propose a framework, RLMutation, for MT applied to RL. Notably,
we use existing taxonomies of faults to build a set of mutation operators
relevant to RL and use a simple heuristic to generate test cases for RL. This
allows us to compare different mutation killing definitions based on existing
approaches, as well as to analyze the behavior of the obtained mutation
operators and their potential combinations, called Higher Order Mutations
(HOMs). We show that the design choice of the mutation killing definition can
affect whether or not a mutation is killed as well as the generated test cases.
Moreover, we found that, even with a relatively small number of test cases and
operators, we manage to generate HOMs with interesting properties that can
enhance testing capability in RL systems.
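The abstract compares different mutation killing definitions. As a rough illustration only (not the paper's exact criteria), a reward-distance killing definition can be sketched as follows, where a mutant is considered killed on a test environment when its mean episode reward falls well outside the distribution observed across healthy training instances; the function name and the threshold parameter `k` are assumptions for this sketch:

```python
import statistics

def is_killed(original_rewards, mutant_rewards, k=2.0):
    """Illustrative reward-based killing criterion: the mutant is 'killed'
    when its mean episode reward is more than k standard deviations below
    the mean reward of healthy (non-mutated) training instances."""
    mu = statistics.mean(original_rewards)
    sigma = statistics.stdev(original_rewards)
    return statistics.mean(mutant_rewards) < mu - k * sigma

# Mean episode rewards from n independent training runs on one test environment.
healthy = [200.1, 195.4, 201.3, 198.7, 199.0]
mutated = [110.2, 95.8, 120.4, 101.3, 99.9]

print(is_killed(healthy, mutated))  # prints True: the reward drop is far below threshold
```

Because RL training is stochastic, such definitions aggregate over several independently trained instances rather than a single run, which is exactly why the choice of definition can change which mutants count as killed.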
Related papers
- muPRL: A Mutation Testing Pipeline for Deep Reinforcement Learning based on Real Faults [19.32186653723838]
We first describe a taxonomy of real RL faults obtained by repository mining.
Then, we present the mutation operators derived from such real faults and implemented in the tool muPRL.
We discuss the experimental results, showing that muPRL is effective at discriminating strong from weak test generators.
arXiv Detail & Related papers (2024-08-27T15:45:13Z)
- Multi-Granularity Semantic Revision for Large Language Model Distillation [66.03746866578274]
We propose a multi-granularity semantic revision method for LLM distillation.
At the sequence level, we propose a sequence correction and re-generation strategy.
At the token level, we design a distribution adaptive clipping Kullback-Leibler loss as the distillation objective function.
At the span level, we leverage the span priors of a sequence to compute the probability correlations within spans, and constrain the teacher and student's probability correlations to be consistent.
arXiv Detail & Related papers (2024-07-14T03:51:49Z)
- Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing [63.20133320524577]
Large Language Models (LLMs) have demonstrated great potential as generalist assistants.
It is crucial that these models exhibit desirable behavioral traits, such as non-toxicity and resilience against jailbreak attempts.
In this paper, we observe that directly editing a small subset of parameters can effectively modulate specific behaviors of LLMs.
arXiv Detail & Related papers (2024-07-11T17:52:03Z)
- An Exploratory Study on Using Large Language Models for Mutation Testing [32.91472707292504]
Large Language Models (LLMs) have shown great potential in code-related tasks but their utility in mutation testing remains unexplored.
This paper investigates the performance of LLMs in generating effective mutations with respect to their usability, fault detection potential, and relationship with real bugs.
We find that compared to existing approaches, LLMs generate more diverse mutations that are behaviorally closer to real bugs.
arXiv Detail & Related papers (2024-06-14T08:49:41Z)
- An Empirical Evaluation of Manually Created Equivalent Mutants [54.02049952279685]
Less than 10% of manually created mutants are equivalent.
Surprisingly, our findings indicate that a significant portion of developers struggle to accurately identify equivalent mutants.
arXiv Detail & Related papers (2024-04-14T13:04:10Z)
- Analyzing Adversarial Inputs in Deep Reinforcement Learning [53.3760591018817]
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification.
We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations.
Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
arXiv Detail & Related papers (2024-02-07T21:58:40Z)
- Solving Continual Offline Reinforcement Learning with Decision Transformer [78.59473797783673]
Continual offline reinforcement learning (CORL) combines continual and offline reinforcement learning.
Existing methods, employing Actor-Critic structures and experience replay (ER), suffer from distribution shifts, low efficiency, and weak knowledge-sharing.
We introduce multi-head DT (MH-DT) and low-rank adaptation DT (LoRA-DT) to mitigate the forgetting problem of the Decision Transformer (DT).
arXiv Detail & Related papers (2024-01-16T16:28:32Z)
- Supervised Pretraining Can Learn In-Context Reinforcement Learning [96.62869749926415]
In this paper, we study the in-context learning capabilities of transformers in decision-making problems.
We introduce and study Decision-Pretrained Transformer (DPT), a supervised pretraining method where the transformer predicts an optimal action.
We find that the pretrained transformer can be used to solve a range of RL problems in-context, exhibiting both exploration online and conservatism offline.
arXiv Detail & Related papers (2023-06-26T17:58:50Z)
- A Probabilistic Framework for Mutation Testing in Deep Neural Networks [12.033944769247958]
We propose a Probabilistic Mutation Testing (PMT) approach that alleviates the inconsistency problem.
PMT effectively allows a more consistent and informed decision on mutations through evaluation.
arXiv Detail & Related papers (2022-08-11T19:45:14Z)
- DeepMetis: Augmenting a Deep Learning Test Set to Increase its Mutation Score [4.444652484439581]
The tool is effective at augmenting the given test set, increasing its capability to detect mutants by 63% on average.
A leave-one-out experiment shows that the augmented test set is capable of exposing unseen mutants.
arXiv Detail & Related papers (2021-09-15T18:20:50Z)
- DeepMutation: A Neural Mutation Tool [26.482720255691646]
DeepMutation is a tool wrapping our deep learning model into a fully automated tool chain.
It can generate, inject, and test mutants learned from real faults.
arXiv Detail & Related papers (2020-02-12T01:57:41Z)
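Several of the papers above derive mutation operators from taxonomies of real RL faults. As a minimal sketch of what such operators can look like in code (the operator names and signatures here are hypothetical, not the actual catalogue of RLMutation or muPRL):

```python
import random

def mutate_reward_noise(env_step, sigma=0.5):
    """Wrap an environment step function so that rewards are perturbed with
    Gaussian noise, mimicking a faulty reward implementation."""
    def stepped(action):
        obs, reward, done, info = env_step(action)
        return obs, reward + random.gauss(0.0, sigma), done, info
    return stepped

def mutate_no_discount(hyperparams):
    """Set the discount factor to 1.0, mimicking a forgotten gamma
    (a fault class commonly reported in RL fault taxonomies)."""
    mutated = dict(hyperparams)
    mutated["gamma"] = 1.0
    return mutated

print(mutate_no_discount({"gamma": 0.99, "lr": 3e-4}))
```

Each operator produces a mutated training configuration or environment; retraining the agent under it and applying a killing definition to the resulting rewards yields the mutation score.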
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.