Dialogue-based generation of self-driving simulation scenarios using
Large Language Models
- URL: http://arxiv.org/abs/2310.17372v1
- Date: Thu, 26 Oct 2023 13:07:01 GMT
- Authors: Antonio Valerio Miceli-Barone, Alex Lascarides, Craig Innes
- Abstract summary: Simulation is an invaluable tool for developing and evaluating controllers for self-driving cars.
Current simulation frameworks are driven by highly-specialist domain-specific languages.
There is often a gap between a concise English utterance and the executable code that captures the user's intent.
- Score: 14.86435467709869
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Simulation is an invaluable tool for developing and evaluating controllers
for self-driving cars. Current simulation frameworks are driven by
highly-specialist domain-specific languages, and so a natural language
interface would greatly enhance usability. But there is often a gap, consisting
of tacit assumptions the user is making, between a concise English utterance
and the executable code that captures the user's intent. In this paper we
describe a system that addresses this issue by supporting an extended
multimodal interaction: the user can follow up prior instructions with
refinements or revisions, in reaction to the simulations that have been
generated from their utterances so far. We use Large Language Models (LLMs) to
map the user's English utterances in this interaction into domain-specific
code, and so we explore the extent to which LLMs capture the context
sensitivity that's necessary for computing the speaker's intended message in
discourse.
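To make the described pipeline concrete, here is a minimal sketch of such a dialogue loop, assuming an OpenAI-style chat API; the model name, the system prompt, and the target scenario language are placeholders rather than the authors' actual implementation:

```python
# Minimal sketch (not the paper's implementation) of an extended interaction
# that maps English utterances to simulator scenario code via an LLM.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM_PROMPT = (
    "You translate English descriptions of driving scenarios into "
    "simulator scenario code. Output only code. When the user refines or "
    "revises an earlier request, emit the full corrected scenario."
)

def dialogue_loop() -> None:
    # The history carries prior utterances and generated code so the model
    # can resolve context-sensitive refinements such as "make that car faster".
    history = [{"role": "system", "content": SYSTEM_PROMPT}]
    while True:
        utterance = input("scenario> ").strip()
        if utterance in {"quit", "exit"}:
            break
        history.append({"role": "user", "content": utterance})
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder; any chat-completion model works
            messages=history,
        )
        code = response.choices[0].message.content
        history.append({"role": "assistant", "content": code})
        print(code)  # the user inspects the resulting simulation, then refines

if __name__ == "__main__":
    dialogue_loop()
```

Keeping the full history in each request is what lets the model resolve follow-up refinements and revisions against earlier utterances and previously generated code.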
Related papers
- Generating Driving Simulations via Conversation [20.757088470174452]
We design a natural language interface to assist a non-coding domain expert in synthesising the desired scenarios and vehicle behaviours.
We show that using it to convert utterances to the symbolic program is feasible, despite the very small training dataset.
Human experiments show that dialogue is critical to successful simulation generation, leading to a 4.5-times higher success rate than generation without extended conversation.
arXiv Detail & Related papers (2024-10-13T13:07:31Z)
- Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions [68.98811048970963]
We present a pioneering effort to investigate the capability of large language models (LLMs) in transcribing speech in multi-talker environments.
Our approach utilizes WavLM and Whisper encoders to extract multi-faceted speech representations that are sensitive to speaker characteristics and semantic context (see the sketch after this entry).
Comprehensive experiments reveal the promising performance of our proposed system, MT-LLM, in cocktail party scenarios.
arXiv Detail & Related papers (2024-09-13T07:28:28Z)
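As a rough illustration of the dual-encoder front end mentioned above (not MT-LLM's actual fusion, whose details are not given here), one can extract and combine frame-level WavLM and Whisper-encoder representations; the checkpoints and the frame-wise concatenation are assumptions:

```python
import numpy as np
import torch
from transformers import (
    AutoFeatureExtractor,
    WavLMModel,
    WhisperFeatureExtractor,
    WhisperModel,
)

wavlm_fe = AutoFeatureExtractor.from_pretrained("microsoft/wavlm-base-plus")
wavlm = WavLMModel.from_pretrained("microsoft/wavlm-base-plus")
whisper_fe = WhisperFeatureExtractor.from_pretrained("openai/whisper-base")
whisper = WhisperModel.from_pretrained("openai/whisper-base")

@torch.no_grad()
def fused_speech_features(waveform: np.ndarray) -> torch.Tensor:
    """waveform: 1-D float array sampled at 16 kHz."""
    # WavLM consumes the raw waveform and yields speaker-sensitive frames.
    wavlm_in = wavlm_fe(waveform, sampling_rate=16000, return_tensors="pt")
    speaker_feats = wavlm(**wavlm_in).last_hidden_state  # (1, T1, 768)
    # The Whisper encoder consumes 80-bin log-mel features (padded to 30 s)
    # and yields semantics-oriented frames.
    mel = whisper_fe(waveform, sampling_rate=16000, return_tensors="pt")
    semantic_feats = whisper.encoder(mel.input_features).last_hidden_state  # (1, T2, 512)
    # Both encoders produce roughly 50 frames per second; trim to the shorter
    # sequence and concatenate per frame as a simple stand-in for fusion.
    t = min(speaker_feats.shape[1], semantic_feats.shape[1])
    return torch.cat([speaker_feats[:, :t], semantic_feats[:, :t]], dim=-1)
```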
- Integrating Disambiguation and User Preferences into Large Language Models for Robot Motion Planning [1.9912315834033756]
The proposed framework interprets human navigation commands containing temporal elements and translates natural language instructions into robot motion planning.
We propose methods to resolve the ambiguity in natural language instructions and capture user preferences.
arXiv Detail & Related papers (2024-04-22T19:38:37Z)
- Large Language User Interfaces: Voice Interactive User Interfaces powered by LLMs [5.06113628525842]
We present a framework that can serve as an intermediary between a user and their user interface (UI).
We employ a system that stands upon textual semantic mappings of UI components, in the form of annotations.
Our engine can classify the most appropriate application, extract relevant parameters, and subsequently execute precise predictions of the user's expected actions (see the sketch after this entry).
arXiv Detail & Related papers (2024-02-07T21:08:49Z)
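A hypothetical sketch of such annotation-driven routing: UI components carry textual semantic annotations, an LLM matches the user's utterance against them, and the chosen component plus extracted parameters come back as JSON. The annotation format, prompt wording, and component names are invented for illustration:

```python
import json
from openai import OpenAI

client = OpenAI()

UI_ANNOTATIONS = {  # component id -> semantic annotation (assumed format)
    "alarm.create": "Set an alarm at a given time; params: time",
    "music.play": "Play a song or playlist; params: query",
    "thermostat.set": "Change the target temperature; params: degrees",
}

def route_utterance(utterance: str) -> dict:
    prompt = (
        "Components:\n"
        + "\n".join(f"{cid}: {desc}" for cid, desc in UI_ANNOTATIONS.items())
        + f"\nUser request: {utterance}\n"
        'Reply with JSON only: {"component": ..., "params": {...}}'
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{"role": "user", "content": prompt}],
    )
    # A production system would validate the output; this sketch trusts it.
    return json.loads(response.choices[0].message.content)

# e.g. route_utterance("wake me at 7") -> {"component": "alarm.create", ...}
```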
- MEIA: Multimodal Embodied Perception and Interaction in Unknown Environments [82.67236400004826]
We introduce the Multimodal Embodied Interactive Agent (MEIA), capable of translating high-level tasks expressed in natural language into a sequence of executable actions.
The MEM module enables MEIA to generate executable action plans based on diverse requirements and the robot's capabilities.
arXiv Detail & Related papers (2024-02-01T02:43:20Z)
- Interpreting User Requests in the Context of Natural Language Standing Instructions [89.12540932734476]
We develop NLSI, a language-to-program dataset consisting of over 2.4K dialogues spanning 17 domains.
A key challenge in NLSI is to identify which subset of the standing instructions is applicable to a given dialogue (see the baseline sketch after this entry).
arXiv Detail & Related papers (2023-11-16T11:19:26Z)
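One simple baseline for that selection step (not necessarily the paper's method) is to embed each standing instruction and the current dialogue, then keep the instructions whose similarity clears a threshold; the encoder and threshold below are illustrative choices:

```python
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative checkpoint

def applicable_instructions(dialogue: str, instructions: list[str],
                            threshold: float = 0.4) -> list[str]:
    # Embed the dialogue and every standing instruction, score by cosine
    # similarity, and keep those above the (tunable) threshold.
    dialogue_emb = encoder.encode(dialogue, convert_to_tensor=True)
    instr_embs = encoder.encode(instructions, convert_to_tensor=True)
    scores = util.cos_sim(dialogue_emb, instr_embs)[0]  # (num_instructions,)
    return [ins for ins, s in zip(instructions, scores) if s.item() >= threshold]

# e.g. applicable_instructions(
#     "Book me a table tonight",
#     ["When I book restaurants, prefer vegan options.",
#      "Always use economy class for flights."])
```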
- Natural Language based Context Modeling and Reasoning for Ubiquitous Computing with Large Language Models: A Tutorial [35.743576799998564]
Large language models (LLMs) have surged in popularity since 2018, two decades after context-aware computing was introduced.
In this tutorial, we demonstrate the use of texts, prompts, and autonomous agents (AutoAgents) that enable LLMs to perform context modeling and reasoning.
arXiv Detail & Related papers (2023-09-24T00:15:39Z)
- AmadeusGPT: a natural language interface for interactive animal behavioral analysis [65.55906175884748]
We introduce AmadeusGPT: a natural language interface that turns natural language descriptions of behaviors into machine-executable code.
We show we can produce state-of-the-art performance on the MABE 2022 behavior challenge tasks.
AmadeusGPT presents a novel way to merge deep biological knowledge, large-language models, and core computer vision modules into a more naturally intelligent system.
arXiv Detail & Related papers (2023-07-10T19:15:17Z)
- In-Context Learning User Simulators for Task-Oriented Dialog Systems [1.7086737326992172]
This paper presents a novel application of large language models in user simulation for task-oriented dialog systems.
By harnessing the power of these models, the proposed approach generates diverse utterances based on user goals and limited dialog examples (see the sketch after this entry).
arXiv Detail & Related papers (2023-06-01T15:06:11Z)
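The in-context simulation idea can be sketched as a few-shot prompt: the user goal and a handful of example dialogues are packed into the context, and sampling with temperature yields varied user utterances. The prompt layout and model are assumptions, not the paper's exact template:

```python
from openai import OpenAI

client = OpenAI()

def simulate_user_turn(goal: str, examples: list[str], dialog: str) -> str:
    # Few-shot prompt: goal, example dialogues, then the dialogue so far.
    prompt = (
        f"You are a user of a task-oriented assistant. Your goal: {goal}\n"
        "Example dialogues:\n" + "\n---\n".join(examples)
        + f"\nCurrent dialogue:\n{dialog}\nUser:"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,      # higher temperature -> more diverse utterances
    )
    return response.choices[0].message.content.strip()
```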
- PADL: Language-Directed Physics-Based Character Control [66.517142635815]
We present PADL, which allows users to issue natural language commands for specifying high-level tasks and low-level skills that a character should perform.
We show that our framework can be applied to effectively direct a simulated humanoid character to perform a diverse array of complex motor skills.
arXiv Detail & Related papers (2023-01-31T18:59:22Z)
- Pre-Trained Language Models for Interactive Decision-Making [72.77825666035203]
We describe a framework for imitation learning in which goals and observations are represented as a sequence of embeddings (see the sketch after this entry).
We demonstrate that this framework enables effective generalization across different environments.
For test tasks involving novel goals or novel scenes, initializing policies with language models improves task completion rates by 43.6%.
arXiv Detail & Related papers (2022-02-03T18:55:52Z)
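A loose sketch of the embedding-sequence policy this last entry describes: goal tokens and observation features are projected into a pre-trained LM's embedding space, concatenated into one sequence, and decoded into an action. The dimensions and action head below are illustrative, not the paper's architecture:

```python
import torch
import torch.nn as nn
from transformers import GPT2Model

class LMPolicy(nn.Module):
    def __init__(self, obs_dim: int = 64, num_actions: int = 8):
        super().__init__()
        self.lm = GPT2Model.from_pretrained("gpt2")  # pre-trained initialization
        hidden = self.lm.config.n_embd               # 768 for gpt2
        self.obs_proj = nn.Linear(obs_dim, hidden)   # observations -> LM space
        self.action_head = nn.Linear(hidden, num_actions)

    def forward(self, goal_ids: torch.Tensor, obs: torch.Tensor) -> torch.Tensor:
        goal_emb = self.lm.wte(goal_ids)             # (B, T_goal, hidden)
        obs_emb = self.obs_proj(obs)                 # (B, T_obs, hidden)
        seq = torch.cat([goal_emb, obs_emb], dim=1)  # one joint sequence
        out = self.lm(inputs_embeds=seq).last_hidden_state
        return self.action_head(out[:, -1])          # logits for the next action
```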