FormulaZero: Distributionally Robust Online Adaptation via Offline
Population Synthesis
- URL: http://arxiv.org/abs/2003.03900v2
- Date: Sat, 22 Aug 2020 17:00:39 GMT
- Title: FormulaZero: Distributionally Robust Online Adaptation via Offline
Population Synthesis
- Authors: Aman Sinha, Matthew O'Kelly, Hongrui Zheng, Rahul Mangharam, John
Duchi, Russ Tedrake
- Abstract summary: Autonomous racing is a domain that penalizes safe but conservative policies.
Current approaches either make simplifying assumptions about other agents or lack robust mechanisms for online adaptation.
We develop a novel method for self-play based on replica-exchange Markov chain Monte Carlo.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Balancing performance and safety is crucial to deploying autonomous vehicles
in multi-agent environments. In particular, autonomous racing is a domain that
penalizes safe but conservative policies, highlighting the need for robust,
adaptive strategies. Current approaches either make simplifying assumptions
about other agents or lack robust mechanisms for online adaptation. This work
makes algorithmic contributions to both challenges. First, to generate a
realistic, diverse set of opponents, we develop a novel method for self-play
based on replica-exchange Markov chain Monte Carlo. Second, we propose a
distributionally robust bandit optimization procedure that adaptively adjusts
risk aversion relative to uncertainty in beliefs about opponents' behaviors. We
rigorously quantify the tradeoffs in performance and robustness when
approximating these computations in real-time motion-planning, and we
demonstrate our methods experimentally on autonomous vehicles that achieve
scaled speeds comparable to Formula One racecars.
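The replica-exchange (parallel tempering) sampler underlying the population-synthesis step can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the 1-D `energy` function, temperature ladder, and step size are hypothetical stand-ins for the paper's opponent-policy objective.

```python
import math
import random

def energy(x):
    # Hypothetical multimodal "opponent quality" energy; lower = more likely.
    return (x * x - 1.0) ** 2

def replica_exchange_mcmc(n_steps=5000, temps=(0.1, 0.5, 1.0, 2.0), step=0.5, seed=0):
    """Replica-exchange Metropolis-Hastings over a 1-D energy landscape.

    Each replica i targets exp(-energy(x) / temps[i]); adjacent replicas
    periodically propose state swaps, letting hot chains ferry the cold
    chain between modes that plain MCMC would rarely cross.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(-2.0, 2.0) for _ in temps]
    samples = []
    for t in range(n_steps):
        # Within-replica Metropolis-Hastings moves.
        for i, T in enumerate(temps):
            prop = xs[i] + rng.gauss(0.0, step)
            log_accept = min(0.0, (energy(xs[i]) - energy(prop)) / T)
            if rng.random() < math.exp(log_accept):
                xs[i] = prop
        # Swap moves between adjacent temperatures.
        if t % 10 == 0:
            i = rng.randrange(len(temps) - 1)
            d_beta = 1.0 / temps[i] - 1.0 / temps[i + 1]
            d_energy = energy(xs[i]) - energy(xs[i + 1])
            if rng.random() < math.exp(min(0.0, d_beta * d_energy)):
                xs[i], xs[i + 1] = xs[i + 1], xs[i]
        samples.append(xs[0])  # keep the coldest chain's trajectory
    return samples
```

The coldest chain's samples are the usable "population"; the hotter chains exist only to improve mixing across modes, which is why the method produces a diverse rather than mode-collapsed set of opponents.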
Related papers
- CRASH: Challenging Reinforcement-Learning Based Adversarial Scenarios For Safety Hardening
This paper introduces CRASH - Challenging Reinforcement-learning based Adversarial scenarios for Safety Hardening.
First, CRASH can control adversarial Non-Player Character (NPC) agents in an AV simulator to automatically induce collisions with the ego vehicle.
We also propose a novel approach, termed safety hardening, which iteratively refines the motion planner by simulating improvement scenarios against adversarial agents.
arXiv Detail & Related papers (2024-11-26T00:00:27Z) - SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries
We introduce SAFE-SIM, a controllable closed-loop safety-critical simulation framework.
Our approach yields two distinct advantages: 1) generating realistic long-tail safety-critical scenarios that closely reflect real-world conditions, and 2) providing controllable adversarial behavior for more comprehensive and interactive evaluations.
We validate our framework empirically using the nuScenes and nuPlan datasets across multiple planners, demonstrating improvements in both realism and controllability.
arXiv Detail & Related papers (2023-12-31T04:14:43Z) - Parameterized Decision-making with Multi-modal Perception for Autonomous Driving
We propose a parameterized decision-making framework with multi-modal perception based on deep reinforcement learning, called AUTO.
A hybrid reward function takes into account aspects of safety, traffic efficiency, passenger comfort, and impact to guide the framework to generate optimal actions.
arXiv Detail & Related papers (2023-12-19T08:27:02Z) - Conformal Policy Learning for Sensorimotor Control Under Distribution Shifts
This paper focuses on the problem of detecting and reacting to changes in the distribution of a sensorimotor controller's observables.
The key idea is the design of switching policies that can take conformal quantiles as input.
We show how to design such policies by using conformal quantiles to switch between base policies with different characteristics.
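The switching rule described above can be sketched in a few lines. This is an illustrative sketch only: the calibration scores, the miscoverage level `alpha`, and the two base policies are hypothetical placeholders, not values from the paper.

```python
import math

def conformal_quantile(scores, alpha=0.1):
    """Finite-sample conformal quantile: the ceil((n+1)(1-alpha))-th smallest
    of n calibration nonconformity scores (clipped to the largest score)."""
    n = len(scores)
    k = min(math.ceil((n + 1) * (1 - alpha)), n)
    return sorted(scores)[k - 1]

def switching_policy(obs_score, quantile, nominal, fallback):
    """Run the nominal policy while the observed nonconformity score stays at
    or below the conformal quantile; otherwise switch to a conservative
    fallback, signaling a likely distribution shift."""
    return fallback if obs_score > quantile else nominal

# Usage with hypothetical calibration scores from in-distribution observations:
calib = [0.8, 1.1, 0.9, 1.3, 0.7, 1.0, 1.2, 0.95, 1.05, 0.85]
q = conformal_quantile(calib, alpha=0.2)
chosen = switching_policy(2.5, q, nominal="aggressive", fallback="safe")
```

The conformal quantile gives the threshold a distribution-free coverage guarantee on in-distribution data, which is what makes the switch a principled shift detector rather than a hand-tuned cutoff.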
arXiv Detail & Related papers (2023-11-02T17:59:30Z) - NeurIPS 2022 Competition: Driving SMARTS
Driving SMARTS is a regular competition designed to tackle problems caused by the distribution shift in dynamic interaction contexts.
The proposed competition supports methodologically diverse solutions, such as reinforcement learning (RL) and offline learning methods.
arXiv Detail & Related papers (2022-11-14T17:10:53Z) - Decision-Making under On-Ramp Merge Scenarios by Distributional Soft Actor-Critic Algorithm
We propose an RL-based end-to-end decision-making method under a framework of offline training and online correction, called the Shielded Distributional Soft Actor-Critic (SDSAC).
The results show that the SDSAC achieves the best safety performance among the baseline algorithms while simultaneously maintaining efficient driving.
arXiv Detail & Related papers (2021-03-08T03:57:32Z) - Deep Structured Reactive Planning
We propose a novel data-driven, reactive planning objective for self-driving vehicles.
We show that our model outperforms a non-reactive variant in successfully completing highly complex maneuvers.
arXiv Detail & Related papers (2021-01-18T01:43:36Z) - Learning from Simulation, Racing in Reality [126.56346065780895]
We present a reinforcement learning-based solution to autonomously race on a miniature race car platform.
We show that a policy that is trained purely in simulation can be successfully transferred to the real robotic setup.
arXiv Detail & Related papers (2020-11-26T14:58:49Z) - Towards a Systematic Computational Framework for Modeling Multi-Agent
Decision-Making at Micro Level for Smart Vehicles in a Smart World [8.899670429041453]
We propose a multi-agent based computational framework for modeling decision-making and strategic interaction at micro level for smart vehicles.
Our aim is to make the framework conceptually sound and practical for a range of realistic applications, including micro path planning for autonomous vehicles.
arXiv Detail & Related papers (2020-09-25T13:05:28Z) - Can Autonomous Vehicles Identify, Recover From, and Adapt to
Distribution Shifts? [104.04999499189402]
Out-of-training-distribution (OOD) scenarios are a common challenge of learning agents at deployment.
We propose an uncertainty-aware planning method called Robust Imitative Planning (RIP).
Our method can detect and recover from some distribution shifts, reducing overconfident and catastrophic extrapolations in OOD scenes.
We introduce CARNOVEL, an autonomous-car novel-scene benchmark, to evaluate the robustness of driving agents on a suite of tasks with distribution shifts.
arXiv Detail & Related papers (2020-06-26T11:07:32Z)