ScenicNL: Generating Probabilistic Scenario Programs from Natural Language
- URL: http://arxiv.org/abs/2405.03709v3
- Date: Wed, 02 Oct 2024 22:58:42 GMT
- Title: ScenicNL: Generating Probabilistic Scenario Programs from Natural Language
- Authors: Karim Elmaaroufi, Devan Shanker, Ana Cismaru, Marcell Vazquez-Chanlatte, Alberto Sangiovanni-Vincentelli, Matei Zaharia, Sanjit A. Seshia
- Abstract summary: We present ScenarioNL, an AI System for creating scenario programs from natural language.
We generate these programs from police crash reports.
We evaluate our system on publicly available autonomous vehicle crash reports in California from the last five years.
- Score: 22.314264838832287
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: For cyber-physical systems (CPS), including robotics and autonomous vehicles, mass deployment has been hindered by fatal errors that occur when operating in rare events. To replicate rare events such as vehicle crashes, many companies have created logging systems and employed crash reconstruction experts to meticulously recreate these valuable events in simulation. However, in these methods, "what if" questions are not easily formulated and answered. We present ScenarioNL, an AI System for creating scenario programs from natural language. Specifically, we generate these programs from police crash reports. Reports normally contain uncertainty about the exact details of the incidents which we represent through a Probabilistic Programming Language (PPL), Scenic. By using Scenic, we can clearly and concisely represent uncertainty and variation over CPS behaviors, properties, and interactions. We demonstrate how commonplace prompting techniques with the best Large Language Models (LLM) are incapable of reasoning about probabilistic scenario programs and generating code for low-resource languages such as Scenic. Our system is comprised of several LLMs chained together with several kinds of prompting strategies, a compiler, and a simulator. We evaluate our system on publicly available autonomous vehicle crash reports in California from the last five years and share insights into how we generate code that is both semantically meaningful and syntactically correct.
Related papers
- Generating Out-Of-Distribution Scenarios Using Language Models [58.47597351184034]
Large Language Models (LLMs) have shown promise in autonomous driving.
This paper introduces a framework for generating diverse Out-Of-Distribution (OOD) driving scenarios.
We evaluate our framework through extensive simulations and introduce a new "OOD-ness" metric.
arXiv Detail & Related papers (2024-11-25T16:38:17Z)
- Generating Driving Simulations via Conversation [20.757088470174452]
We design a natural language interface to assist a non-coding domain expert in synthesising the desired scenarios and vehicle behaviours.
We show that using it to convert utterances to the symbolic program is feasible, despite the very small training dataset.
Human experiments show that dialogue is critical to successful simulation generation, leading to a 4.5 times higher success rate than a generation without engaging in extended conversation.
arXiv Detail & Related papers (2024-10-13T13:07:31Z)
- LeGEND: A Top-Down Approach to Scenario Generation of Autonomous Driving Systems Assisted by Large Language Models [9.841914333647631]
We propose LeGEND, which takes a top-down approach to scenario generation.
It starts with abstract functional scenarios, and then steps downwards to logical and concrete scenarios.
Unlike logical scenarios that can be formally described, functional scenarios are often documented in natural languages.
arXiv Detail & Related papers (2024-09-16T08:01:21Z)
- Compromising Embodied Agents with Contextual Backdoor Attacks [69.71630408822767]
Large language models (LLMs) have transformed the development of embodied intelligence.
This paper uncovers a significant backdoor security threat within this process.
By poisoning just a few contextual demonstrations, attackers can covertly compromise the contextual environment of a black-box LLM.
arXiv Detail & Related papers (2024-08-06T01:20:12Z)
- ChatScene: Knowledge-Enabled Safety-Critical Scenario Generation for Autonomous Vehicles [17.396416459648755]
ChatScene is a Large Language Model (LLM)-based agent that generates safety-critical scenarios for autonomous vehicles.
A key part of our agent is a comprehensive knowledge retrieval component, which efficiently translates specific textual descriptions into corresponding domain-specific code snippets.
arXiv Detail & Related papers (2024-05-22T23:21:15Z)
- Fault-Aware Neural Code Rankers [64.41888054066861]
We propose fault-aware neural code rankers that can predict the correctness of a sampled program without executing it.
Our fault-aware rankers can significantly increase the pass@1 accuracy of various code generation models.
arXiv Detail & Related papers (2022-06-04T22:01:05Z)
- A Conversational Paradigm for Program Synthesis [110.94409515865867]
We propose a conversational program synthesis approach via large language models.
We train a family of large language models, called CodeGen, on natural language and programming language data.
Our findings show the emergence of conversational capabilities and the effectiveness of the proposed conversational program synthesis paradigm.
arXiv Detail & Related papers (2022-03-25T06:55:15Z)
- Generating and Characterizing Scenarios for Safety Testing of Autonomous Vehicles [86.9067793493874]
We propose efficient mechanisms to characterize and generate testing scenarios using a state-of-the-art driving simulator.
We use our method to characterize real driving data from the Next Generation Simulation (NGSIM) project.
We rank the scenarios by defining metrics based on the complexity of avoiding accidents and provide insights into how the AV could have minimized the probability of incurring an accident.
arXiv Detail & Related papers (2021-03-12T17:00:23Z)
- Scenic: A Language for Scenario Specification and Data Generation [17.07493567658614]
We propose a new probabilistic programming language for the design and analysis of cyber-physical systems.
In this paper, we focus on systems like autonomous cars and robots, whose environment at any point in time is a 'scene'.
We design a domain-specific language, Scenic, for describing scenarios that are distributions over scenes and the behaviors of their agents over time.
arXiv Detail & Related papers (2020-10-13T17:58:31Z)
- Contextualized Perturbation for Textual Adversarial Attack [56.370304308573274]
Adversarial examples expose the vulnerabilities of natural language processing (NLP) models.
This paper presents CLARE, a ContextuaLized AdversaRial Example generation model that produces fluent and grammatical outputs.
arXiv Detail & Related papers (2020-09-16T06:53:15Z)