Multimodal Safety-Critical Scenarios Generation for Decision-Making
Algorithms Evaluation
- URL: http://arxiv.org/abs/2009.08311v3
- Date: Sat, 26 Dec 2020 16:54:12 GMT
- Title: Multimodal Safety-Critical Scenarios Generation for Decision-Making
Algorithms Evaluation
- Authors: Wenhao Ding, Baiming Chen, Bo Li, Kim Ji Eun, Ding Zhao
- Abstract summary: Existing neural network-based autonomous systems are shown to be vulnerable to adversarial attacks.
We propose a flow-based multimodal safety-critical scenario generator for evaluating decision-making algorithms.
We evaluate six Reinforcement Learning algorithms with our generated traffic scenarios and provide empirical conclusions about their robustness.
- Score: 23.43175124406634
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing neural network-based autonomous systems are shown to be vulnerable to adversarial attacks, so a thorough evaluation of their robustness is of great importance. However, evaluating robustness only under worst-case scenarios derived from known attacks is not comprehensive, and some of those attacks rarely occur in the real world. In addition, the distribution of safety-critical data is usually multimodal, while most traditional attacks and evaluation methods focus on a single modality. To address these challenges, we propose a flow-based multimodal safety-critical scenario generator for evaluating decision-making algorithms. The proposed generative model is optimized with weighted likelihood maximization, and a gradient-based sampling procedure is integrated to improve sampling efficiency. Safety-critical scenarios are generated by querying the task algorithms, and the log-likelihood of a generated scenario is proportional to its risk level. Experiments on a self-driving task demonstrate our advantages in testing efficiency and multimodal modeling capability. We evaluate six Reinforcement Learning algorithms with our generated traffic scenarios and provide empirical conclusions about their robustness.
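As an illustration of the training objective described above, the sketch below fits a small normalizing flow so that scenarios judged risky by the task algorithm receive proportionally higher likelihood. This is a minimal sketch of weighted likelihood maximization, not the authors' released implementation: the 2-D scenario parameterization, the RealNVP-style coupling layers, the placeholder risk() function standing in for a query to the decision-making policy, and all hyperparameters are assumptions made for the example, and the paper's gradient-based sampling refinement is omitted.

```python
# Minimal sketch (PyTorch) of weighted likelihood maximization for a flow-based
# scenario generator. Everything below (2-D scenarios, coupling-layer flow,
# toy risk function, hyperparameters) is an illustrative assumption.
import torch
import torch.nn as nn


class Coupling(nn.Module):
    """RealNVP-style affine coupling layer over a 2-D scenario vector."""

    def __init__(self, swap: bool):
        super().__init__()
        self.swap = swap  # alternate which half conditions the other
        self.net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 2))

    def _split(self, v):
        cond, trans = v[:, :1], v[:, 1:]
        return (trans, cond) if self.swap else (cond, trans)

    def _merge(self, cond, trans):
        return torch.cat([trans, cond], 1) if self.swap else torch.cat([cond, trans], 1)

    def forward(self, x):  # scenario x -> latent z, plus log|det dz/dx|
        cond, trans = self._split(x)
        s, t = self.net(cond).chunk(2, dim=1)
        return self._merge(cond, (trans - t) * torch.exp(-s)), -s.sum(dim=1)

    def inverse(self, z):  # latent z -> scenario x (used when sampling)
        cond, trans = self._split(z)
        s, t = self.net(cond).chunk(2, dim=1)
        return self._merge(cond, trans * torch.exp(s) + t)


class ScenarioFlow(nn.Module):
    """Stack of coupling layers with a standard-normal base distribution."""

    def __init__(self, n_layers: int = 4):
        super().__init__()
        self.layers = nn.ModuleList([Coupling(swap=i % 2 == 1) for i in range(n_layers)])
        self.base = torch.distributions.Normal(0.0, 1.0)

    def log_prob(self, x):
        logdet = x.new_zeros(x.shape[0])
        for layer in self.layers:
            x, ld = layer(x)
            logdet = logdet + ld
        return self.base.log_prob(x).sum(dim=1) + logdet

    @torch.no_grad()
    def sample(self, n: int):
        z = self.base.sample((n, 2))
        for layer in reversed(self.layers):
            z = layer.inverse(z)
        return z


def risk(scenarios: torch.Tensor) -> torch.Tensor:
    """Stand-in for querying the task algorithm: in the paper this would roll out
    the decision-making policy in simulation and score how unsafe the outcome is."""
    return torch.sigmoid(4.0 - scenarios.norm(dim=1))


flow = ScenarioFlow()
opt = torch.optim.Adam(flow.parameters(), lr=1e-3)
for step in range(1000):
    x = flow.sample(256)                    # propose candidate scenarios
    w = risk(x)                             # higher risk -> larger weight
    loss = -(w * flow.log_prob(x)).mean()   # weighted likelihood maximization
    opt.zero_grad()
    loss.backward()
    opt.step()

critical = flow.sample(16)                  # likely-risky scenarios for evaluation
```

In practice the risk score would come from rolling out the policy under test in a driving simulator, and samples from the trained flow would then serve as the evaluation scenarios.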
Related papers
- Risk-Averse Certification of Bayesian Neural Networks [70.44969603471903]
We propose a Risk-Averse Certification framework for Bayesian neural networks called RAC-BNN.
Our method leverages sampling and optimisation to compute a sound approximation of the output set of a BNN.
We validate RAC-BNN on a range of regression and classification benchmarks and compare its performance with a state-of-the-art method.
arXiv Detail & Related papers (2024-11-29T14:22:51Z)
- EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction.
Foundation models as the "brain" of EAI agents for high-level task planning have shown promising results.
However, the deployment of these agents in physical environments presents significant safety challenges.
This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z)
- ASSERT: Automated Safety Scenario Red Teaming for Evaluating the Robustness of Large Language Models [65.79770974145983]
ASSERT (Automated Safety Scenario Red Teaming) consists of three methods: semantically aligned augmentation, target bootstrapping, and adversarial knowledge injection.
We partition our prompts into four safety domains for a fine-grained analysis of how the domain affects model performance.
We find statistically significant performance differences of up to 11% in absolute classification accuracy among semantically related scenarios and error rates of up to 19% absolute error in zero-shot adversarial settings.
arXiv Detail & Related papers (2023-10-14T17:10:28Z)
- CausalAF: Causal Autoregressive Flow for Safety-Critical Driving Scenario Generation [34.45216283597149]
We propose a flow-based generative framework, Causal Autoregressive Flow (CausalAF).
CausalAF encourages the generative model to uncover and follow the causal relationship among generated objects.
We show that using generated scenarios as additional training samples empirically improves the robustness of autonomous driving algorithms.
arXiv Detail & Related papers (2021-10-26T18:07:48Z)
- Generating and Characterizing Scenarios for Safety Testing of Autonomous Vehicles [86.9067793493874]
We propose efficient mechanisms to characterize and generate testing scenarios using a state-of-the-art driving simulator.
We use our method to characterize real driving data from the Next Generation Simulation (NGSIM) project.
We rank the scenarios by defining metrics based on the complexity of avoiding accidents and provide insights into how the AV could have minimized the probability of incurring an accident.
arXiv Detail & Related papers (2021-03-12T17:00:23Z)
- Increasing the Confidence of Deep Neural Networks by Coverage Analysis [71.57324258813674]
This paper presents a lightweight monitoring architecture based on coverage paradigms to enhance model robustness against different unsafe inputs.
Experimental results show that the proposed approach is effective in detecting both powerful adversarial examples and out-of-distribution inputs.
arXiv Detail & Related papers (2021-01-28T16:38:26Z)
- Efficient falsification approach for autonomous vehicle validation using a parameter optimisation technique based on reinforcement learning [6.198523595657983]
The wide-scale deployment of Autonomous Vehicles (AVs) appears to be imminent despite many safety challenges that are yet to be resolved.
Uncertainties in the behaviour of traffic participants and the dynamic world cause reactions in advanced autonomous systems.
This paper presents an efficient falsification method to evaluate the System Under Test.
arXiv Detail & Related papers (2020-11-16T02:56:13Z)
- Neural Bridge Sampling for Evaluating Safety-Critical Autonomous Systems [34.945482759378734]
We employ a probabilistic approach to safety evaluation in simulation, where we are concerned with computing the probability of dangerous events.
We develop a novel rare-event simulation method that combines exploration, exploitation, and optimization techniques to find failure modes and estimate their rate of occurrence.
arXiv Detail & Related papers (2020-08-24T17:46:27Z)
- SAMBA: Safe Model-Based & Active Reinforcement Learning [59.01424351231993]
SAMBA is a framework for safe reinforcement learning that combines aspects from probabilistic modelling, information theory, and statistics.
We evaluate our algorithm on a variety of safe dynamical system benchmarks involving both low and high-dimensional state representations.
We provide intuition about the effectiveness of the framework through a detailed analysis of our active metrics and safety constraints.
arXiv Detail & Related papers (2020-06-12T10:40:46Z)
- Efficient statistical validation with edge cases to evaluate Highly Automated Vehicles [6.198523595657983]
The wide-scale deployment of Autonomous Vehicles seems to be imminent despite many safety challenges that are yet to be resolved.
Existing standards focus on deterministic processes where the validation requires only a set of test cases that cover the requirements.
This paper presents a new approach to computing the statistical characteristics of a system's behaviour by biasing automatically generated test cases towards worst-case scenarios.
arXiv Detail & Related papers (2020-03-04T04:35:22Z)
- Learning to Collide: An Adaptive Safety-Critical Scenarios Generating Method [20.280573307366627]
We propose a generative framework to create safety-critical scenarios for evaluating task algorithms.
We demonstrate that the proposed framework generates safety-critical scenarios more efficiently than grid search or human design methods.
arXiv Detail & Related papers (2020-03-02T21:26:03Z)