DOROTHIE: Spoken Dialogue for Handling Unexpected Situations in
Interactive Autonomous Driving Agents
- URL: http://arxiv.org/abs/2210.12511v1
- Date: Sat, 22 Oct 2022 17:52:46 GMT
- Authors: Ziqiao Ma, Ben VanDerPloeg, Cristian-Paul Bara, Yidong Huang, Eui-In
Kim, Felix Gervits, Matthew Marge, Joyce Chai
- Abstract summary: We introduce Dialogue On the ROad To Handle Irregular Events (DOROTHIE), a novel interactive simulation platform.
Based on this platform, we created the Situated Dialogue Navigation (SDN), a navigation benchmark of 183 trials.
SDN is developed to evaluate the agent's ability to predict dialogue moves from humans as well as generate its own dialogue moves and physical navigation actions.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In the real world, autonomous driving agents navigate in highly dynamic
environments full of unexpected situations where pre-trained models are
unreliable. In these situations, what is immediately available to vehicles is
often only human operators. Empowering autonomous driving agents with the
ability to navigate in a continuous and dynamic environment and to communicate
with humans through sensorimotor-grounded dialogue becomes critical. To this
end, we introduce Dialogue On the ROad To Handle Irregular Events (DOROTHIE), a
novel interactive simulation platform that enables the creation of unexpected
situations on the fly to support empirical studies on situated communication
with autonomous driving agents. Based on this platform, we created the Situated
Dialogue Navigation (SDN), a navigation benchmark of 183 trials with a total of
8415 utterances, around 18.7 hours of control streams, and 2.9 hours of trimmed
audio. SDN is developed to evaluate the agent's ability to predict dialogue
moves from humans as well as generate its own dialogue moves and physical
navigation actions. We further developed a transformer-based baseline model for
these SDN tasks. Our empirical results indicate that language-guided navigation
in a highly dynamic environment is an extremely difficult task for end-to-end
models. These results will provide insight towards future work on robust
autonomous driving agents. The DOROTHIE platform, SDN benchmark, and code for
the baseline model are available at https://github.com/sled-group/DOROTHIE.
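As a rough illustration of the dialogue-move prediction side of SDN, the sketch below frames it as utterance classification. The move labels, data classes, and keyword heuristic are hypothetical placeholders for illustration only, not the paper's actual label set or its transformer baseline:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Turn:
    speaker: str    # "operator" (human) or "agent"
    utterance: str
    move: str       # gold dialogue-move label (hypothetical inventory)

@dataclass
class Trial:
    turns: List[Turn] = field(default_factory=list)

def predict_move(utterance: str) -> str:
    """Toy keyword heuristic standing in for a learned classifier:
    map a human utterance to a coarse dialogue move."""
    text = utterance.lower()
    if text.endswith("?"):
        return "QueryYN"
    if any(w in text for w in ("turn", "go", "stop", "take")):
        return "Instruct"
    if any(w in text for w in ("yes", "correct", "okay")):
        return "Confirm"
    return "Acknowledge"

def move_accuracy(trial: Trial) -> float:
    """Score move prediction on the human side of one trial."""
    human = [t for t in trial.turns if t.speaker == "operator"]
    if not human:
        return 0.0
    hits = sum(predict_move(t.utterance) == t.move for t in human)
    return hits / len(human)
```

In the benchmark itself, per the abstract, predictions would additionally condition on the continuous control stream and audio, and the agent must also generate its own dialogue moves and physical navigation actions.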
Related papers
- EmoBipedNav: Emotion-aware Social Navigation for Bipedal Robots with Deep Reinforcement Learning [11.622119393400341]
This study presents EmoBipedNav, an emotion-aware navigation framework for bipedal robots walking in socially interactive environments.
The proposed framework incorporates full-order dynamics and locomotion constraints during training, effectively accounting for tracking errors and restrictions of the locomotion controller.
arXiv Detail & Related papers (2025-03-16T15:11:57Z)
- doScenes: An Autonomous Driving Dataset with Natural Language Instruction for Human Interaction and Vision-Language Navigation [0.0]
doScenes is a novel dataset designed to facilitate research on human-vehicle instruction interactions.
doScenes bridges the gap between instruction and driving response, enabling context-aware and adaptive planning.
arXiv Detail & Related papers (2024-12-08T11:16:47Z)
- Collaborative Instance Object Navigation: Leveraging Uncertainty-Awareness to Minimize Human-Agent Dialogues [54.81155589931697]
Collaborative Instance Object Navigation (CoIN) is a new task setting in which the agent actively resolves uncertainties about the target instance.
We propose Agent-user Interaction with UncerTainty Awareness (AIUTA), a novel training-free method.
First, upon object detection, a Self-Questioner model initiates a self-dialogue within the agent to obtain a complete and accurate observation description.
An Interaction Trigger module determines whether to ask the human a question, continue navigating, or halt.
arXiv Detail & Related papers (2024-12-02T08:16:38Z)
- WHALES: A Multi-agent Scheduling Dataset for Enhanced Cooperation in Autonomous Driving [54.365702251769456]
We present a dataset with an unprecedented average of 8.4 agents per driving sequence.
In addition to providing the largest number of agents and viewpoints among autonomous driving datasets, WHALES records agent behaviors.
We conduct experiments on the agent scheduling task, in which the ego agent selects one of multiple candidate agents to cooperate with.
arXiv Detail & Related papers (2024-11-20T14:12:34Z)
- DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences [12.51538076211772]
We introduce DriVLMe, a video-language-model-based agent to facilitate communication between humans and autonomous vehicles.
We demonstrate competitive performance in both open-loop benchmarks and closed-loop human studies.
arXiv Detail & Related papers (2024-06-05T07:14:44Z)
- Unifying Large Language Model and Deep Reinforcement Learning for Human-in-Loop Interactive Socially-aware Navigation [16.789333617628138]
Social robot navigation planners face two major challenges: managing real-time user inputs and ensuring socially compliant behaviors.
We introduce SALM, an interactive, human-in-loop Socially-Aware navigation Large Language Model framework.
A memory mechanism archives temporal data for continuous refinement, while a multi-step graph-of-thoughts inference-based large language feedback model adaptively fuses the strengths of both planning approaches.
arXiv Detail & Related papers (2024-03-22T23:12:28Z)
- DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving [69.82743399946371]
DriveMLM is a framework that can perform closed-loop autonomous driving in realistic simulators.
We employ a multi-modal LLM (MLLM) to model the behavior planning module of a modular AD system.
This model can be plugged into existing AD systems such as Apollo for closed-loop driving.
arXiv Detail & Related papers (2023-12-14T18:59:05Z)
- Interactive Autonomous Navigation with Internal State Inference and Interactivity Estimation [58.21683603243387]
We propose three auxiliary tasks with relational-temporal reasoning and integrate them into the standard Deep Learning framework.
These auxiliary tasks provide additional supervision signals to infer the behavior patterns of other interactive agents.
Our approach achieves robust and state-of-the-art performance in terms of standard evaluation metrics.
arXiv Detail & Related papers (2023-11-27T18:57:42Z)
- COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked Vehicles [54.61668577827041]
We introduce COOPERNAUT, an end-to-end learning model that uses cross-vehicle perception for vision-based cooperative driving.
Our experiments on AutoCastSim suggest that our cooperative perception driving models lead to a 40% improvement in average success rate.
arXiv Detail & Related papers (2022-05-04T17:55:12Z)
- Fully End-to-end Autonomous Driving with Semantic Depth Cloud Mapping and Multi-Agent [2.512827436728378]
We propose a novel deep learning model trained in an end-to-end, multi-task manner to perform both perception and control tasks simultaneously.
The model is evaluated in the CARLA simulator on scenarios combining normal and adversarial situations under different weather conditions to mimic the real world.
arXiv Detail & Related papers (2022-04-12T03:57:01Z)
- Multi-Agent Reinforcement Learning for Markov Routing Games: A New Modeling Paradigm For Dynamic Traffic Assignment [11.093194714316434]
We develop a Markov routing game (MRG) in which each agent learns and updates her own en-route path choice policy.
We show that the routing behavior of intelligent agents converges to the classical notion of predictive dynamic user equilibrium.
arXiv Detail & Related papers (2020-11-22T02:31:14Z)
- SMARTS: Scalable Multi-Agent Reinforcement Learning Training School for Autonomous Driving [96.50297622371457]
Multi-agent interaction is a fundamental aspect of autonomous driving in the real world.
Despite more than a decade of research and development, the problem of how to interact with diverse road users in diverse scenarios remains largely unsolved.
We develop a dedicated simulation platform called SMARTS that generates diverse and competent driving interactions.
arXiv Detail & Related papers (2020-10-19T18:26:10Z)
- Intelligent Roundabout Insertion using Deep Reinforcement Learning [68.8204255655161]
We present a maneuver planning module able to negotiate entry into busy roundabouts.
The proposed module is based on a neural network trained to predict when and how to enter the roundabout throughout the whole duration of the maneuver.
arXiv Detail & Related papers (2020-01-03T11:16:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.