Uncertainty-Aware Decision Transformer for Stochastic Driving Environments
- URL: http://arxiv.org/abs/2309.16397v3
- Date: Mon, 07 Oct 2024 12:05:12 GMT
- Title: Uncertainty-Aware Decision Transformer for Stochastic Driving Environments
- Authors: Zenan Li, Fan Nie, Qiao Sun, Fang Da, Hang Zhao
- Abstract summary: We introduce an UNcertainty-awaRE deciSion Transformer (UNREST) for planning in stochastic driving environments.
UNREST estimates uncertainties by conditional mutual information between transitions and returns.
We replace the global returns in decision transformers with truncated returns less affected by environments to learn from actual outcomes.
- Score: 34.78461208843929
- License:
- Abstract: Offline Reinforcement Learning (RL) enables policy learning without active interactions, making it especially appealing for self-driving tasks. Recent successes of Transformers inspire casting offline RL as sequence modeling, which, however, fails in stochastic environments because it incorrectly assumes that identical actions can consistently achieve the same goal. In this paper, we introduce an UNcertainty-awaRE deciSion Transformer (UNREST) for planning in stochastic driving environments without introducing additional transition or complex generative models. Specifically, UNREST estimates uncertainties via the conditional mutual information between transitions and returns. Observing the 'uncertainty accumulation' and 'temporal locality' properties of driving environments, we replace the global returns in decision transformers with truncated returns that are less affected by the environment, so the model learns from the actual outcomes of actions rather than from environment transitions. We also dynamically evaluate uncertainty at inference for cautious planning. Extensive experiments demonstrate UNREST's superior performance in various driving scenarios and the power of our uncertainty estimation strategy.
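The central mechanism described above is return relabeling: returns-to-go are cut off at steps where the environment, rather than the agent, dominates the outcome. Below is a minimal NumPy sketch of this idea, assuming per-step uncertainty scores (for example, an estimate of the conditional mutual information between transitions and returns) are already available; the function name and the fixed threshold are illustrative assumptions, not details from the paper.

```python
import numpy as np

def truncated_returns_to_go(rewards, uncertainties, threshold=0.5, gamma=1.0):
    """Relabel one trajectory with returns truncated at high-uncertainty steps.

    A minimal sketch, not the paper's implementation: per-step uncertainty
    scores are assumed to be given, and `threshold` is a hypothetical knob.
    rewards, uncertainties: arrays of shape (T,)
    """
    T = len(rewards)
    rtg = np.zeros(T)
    for t in reversed(range(T)):
        if uncertainties[t] > threshold:
            # Environment-dominated step: cut the return here so the target
            # reflects outcomes of the agent's actions, not later stochasticity.
            rtg[t] = rewards[t]
        else:
            rtg[t] = rewards[t] + gamma * (rtg[t + 1] if t + 1 < T else 0.0)
    return rtg
```

At inference time, the same per-step uncertainty signal could gate between return-conditioned action selection and a more cautious fallback, in line with the dynamic uncertainty evaluation mentioned in the abstract.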
Related papers
- Adversarial Safety-Critical Scenario Generation using Naturalistic Human Driving Priors [2.773055342671194]
We introduce a natural adversarial scenario generation solution using naturalistic human driving priors and reinforcement learning techniques.
Our findings demonstrate that the proposed model can generate realistic safety-critical test scenarios covering both naturalness and adversariality.
arXiv Detail & Related papers (2024-08-06T13:58:56Z) - Latent Plan Transformer for Trajectory Abstraction: Planning as Latent Space Inference [53.419249906014194]
We study generative modeling for planning with datasets repurposed from offline reinforcement learning.
We introduce the Latent Plan Transformer (LPT), a novel model that leverages a latent variable to connect a Transformer-based trajectory generator and the final return.
arXiv Detail & Related papers (2024-02-07T08:18:09Z) - Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting [11.106812447960186]
We introduce a novel trajectory generator named Controllable Diffusion Trajectory (CDT)
CDT integrates information and social interactions into a Transformer-based conditional denoising diffusion model to guide the prediction of future trajectories.
To ensure multimodality, we incorporate behavioral tokens to direct the trajectory's modes, such as going straight, turning right or left.
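As a rough illustration of mode conditioning with behavioral tokens (the module, dimensions, and token vocabulary below are assumptions for the sketch, not the CDT architecture), a discrete behavior id can be embedded and prepended to the sequence that a Transformer-based denoiser attends over:

```python
import torch
import torch.nn as nn

BEHAVIORS = {"straight": 0, "turn_left": 1, "turn_right": 2}  # illustrative vocabulary

class BehaviorConditionedDenoiser(nn.Module):
    def __init__(self, d_model=128, n_behaviors=3, state_dim=2):
        super().__init__()
        self.behavior_emb = nn.Embedding(n_behaviors, d_model)   # behavioral token
        self.step_emb = nn.Linear(state_dim + 1, d_model)        # noisy waypoint + diffusion step
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=2,
        )
        self.head = nn.Linear(d_model, state_dim)                 # predicted noise per waypoint

    def forward(self, noisy_traj, diffusion_step, behavior_id):
        # noisy_traj: (B, H, state_dim); diffusion_step: (B,); behavior_id: (B,)
        B, H, _ = noisy_traj.shape
        step = diffusion_step.float().view(B, 1, 1).expand(B, H, 1)
        tokens = self.step_emb(torch.cat([noisy_traj, step], dim=-1))
        behavior = self.behavior_emb(behavior_id).unsqueeze(1)    # prepend one mode token
        h = self.encoder(torch.cat([behavior, tokens], dim=1))
        return self.head(h[:, 1:])                                # drop the mode token
```

Sampling with different behavior ids then yields distinct, controllable trajectory modes.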
arXiv Detail & Related papers (2024-02-06T13:16:54Z) - Dealing with uncertainty: balancing exploration and exploitation in deep recurrent reinforcement learning [0.0]
Incomplete knowledge of the environment leads an agent to make decisions under uncertainty.
This gives rise to one of the major dilemmas in Reinforcement Learning (RL): an autonomous agent has to balance two contrasting needs, exploration and exploitation, in making its decisions.
We show that adaptive methods better approximate the trade-off between exploration and exploitation.
arXiv Detail & Related papers (2023-10-12T13:45:33Z) - Environment Transformer and Policy Optimization for Model-Based Offline Reinforcement Learning [25.684201757101267]
We propose an uncertainty-aware sequence modeling architecture called Environment Transformer.
Benefiting from accurate modeling of the transition dynamics and reward function, Environment Transformer can be combined with arbitrary planning, dynamic programming, or policy optimization algorithms for offline RL.
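Since the summary says the learned model can be paired with arbitrary planners, here is a minimal, model-agnostic sketch of that pattern: a random-shooting planner scoring candidate action sequences with a learned dynamics/reward model. The `model(state, action) -> (next_state, reward, uncertainty)` interface and the pessimistic uncertainty penalty are assumptions for illustration, not the paper's actual interface.

```python
def plan_with_learned_model(model, state, sample_action, horizon=10, n_candidates=64):
    """Random-shooting planning on top of a learned sequence model of the environment.

    Illustrative sketch only: `model(state, action)` is assumed to return
    (next_state, reward, uncertainty).
    """
    best_return, best_first_action = float("-inf"), None
    for _ in range(n_candidates):
        s, total = state, 0.0
        actions = [sample_action() for _ in range(horizon)]
        for a in actions:
            s, r, u = model(s, a)
            total += r - u  # pessimistic score: penalize uncertain model predictions
        if total > best_return:
            best_return, best_first_action = total, actions[0]
    return best_first_action
```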
arXiv Detail & Related papers (2023-03-07T11:26:09Z) - Augmenting Reinforcement Learning with Transformer-based Scene Representation Learning for Decision-making of Autonomous Driving [27.84595432822612]
We propose the Scene-Rep Transformer to improve reinforcement learning decision-making capabilities.
A multi-stage Transformer (MST) encoder is constructed to model the interaction awareness between the ego vehicle and its neighbors.
A sequential latent Transformer (SLT) with self-supervised learning objectives is employed to distill the future predictive information into the latent scene representation.
arXiv Detail & Related papers (2022-08-24T08:05:18Z) - Generalizing Decision Making for Automated Driving with an Invariant Environment Representation using Deep Reinforcement Learning [55.41644538483948]
Current approaches either do not generalize well beyond the training data or cannot handle a variable number of traffic participants.
We propose an invariant environment representation from the perspective of the ego vehicle.
We show that, thanks to this abstraction, the agents are able to generalize successfully to unseen scenarios.
arXiv Detail & Related papers (2021-02-12T20:37:29Z) - Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings [129.80279257258098]
Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous.
We propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments.
We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk.
arXiv Detail & Related papers (2020-08-15T01:40:59Z) - Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts? [104.04999499189402]
Out-of-training-distribution (OOD) scenarios are a common challenge for learning agents at deployment.
We propose an uncertainty-aware planning method called robust imitative planning (RIP).
Our method can detect and recover from some distribution shifts, reducing the overconfident and catastrophic extrapolations in OOD scenes.
We introduce an autonomous car novel-scene benchmark, CARNOVEL, to evaluate the robustness of driving agents to a suite of tasks with distribution shifts.
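A common way to realize this kind of uncertainty-aware plan selection is to score candidate plans under an ensemble of learned driving models and prefer plans whose worst-case likelihood is high. The short sketch below illustrates that pattern under the assumption of a `log_prob(plan, scene)` interface; it is not necessarily RIP's exact aggregation rule.

```python
def robust_plan_score(plan, ensemble, scene):
    """Worst-case log-likelihood of a candidate plan under an ensemble of
    learned driving models (illustrative sketch of uncertainty-aware scoring)."""
    return min(model.log_prob(plan, scene) for model in ensemble)

def select_plan(candidates, ensemble, scene):
    # Prefer plans that every model in the ensemble finds plausible; strong
    # disagreement (low worst-case likelihood) signals an out-of-distribution scene.
    return max(candidates, key=lambda plan: robust_plan_score(plan, ensemble, scene))
```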
arXiv Detail & Related papers (2020-06-26T11:07:32Z) - Deep Reinforcement Learning amidst Lifelong Non-Stationarity [67.24635298387624]
We show that an off-policy RL algorithm can reason about and tackle lifelong non-stationarity.
Our method leverages latent variable models to learn a representation of the environment from current and past experiences.
We also introduce several simulation environments that exhibit lifelong non-stationarity, and empirically find that our approach substantially outperforms approaches that do not reason about environment shift.
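As a rough sketch of the latent-variable idea (the architecture, dimensions, and mean-pooling are illustrative assumptions, not the paper's design), recent transitions can be encoded into a latent context vector that the policy conditions on:

```python
import torch
import torch.nn as nn

class LatentContextEncoder(nn.Module):
    """Summarize recent experience into a latent descriptor of the (shifting)
    environment; a minimal sketch, not the paper's architecture."""

    def __init__(self, obs_dim, act_dim, latent_dim=16, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim + 1 + obs_dim, hidden),  # (s, a, r, s')
            nn.ReLU(),
            nn.Linear(hidden, latent_dim),
        )

    def forward(self, states, actions, rewards, next_states):
        # states: (N, obs_dim), actions: (N, act_dim), rewards: (N, 1)
        x = torch.cat([states, actions, rewards, next_states], dim=-1)
        return self.net(x).mean(dim=0)  # pool N recent transitions into one latent z

# The latent z is then concatenated to the current observation before the policy:
#   action = policy(torch.cat([obs, z], dim=-1))
```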
arXiv Detail & Related papers (2020-06-18T17:34:50Z)