Uncertainty-Aware Decision Transformer for Stochastic Driving
Environments
- URL: http://arxiv.org/abs/2309.16397v2
- Date: Fri, 17 Nov 2023 05:41:45 GMT
- Title: Uncertainty-Aware Decision Transformer for Stochastic Driving
Environments
- Authors: Zenan Li, Fan Nie, Qiao Sun, Fang Da, Hang Zhao
- Abstract summary: We introduce an UNcertainty-awaRE deciSion Transformer (UNREST) for planning in driving environments.
UNREST estimates uncertainties by the conditional mutual information between transitions and returns.
We show UNREST's superior performance in various driving scenarios and the power of our uncertainty estimation strategy.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Offline Reinforcement Learning (RL) has emerged as a promising framework for
learning policies without active interactions, making it especially appealing
for autonomous driving tasks. Recent successes of Transformers inspire casting
offline RL as sequence modeling, which performs well in long-horizon tasks.
However, they are overly optimistic in stochastic environments with incorrect
assumptions that the same goal can be consistently achieved by identical
actions. In this paper, we introduce an UNcertainty-awaRE deciSion Transformer
(UNREST) for planning in stochastic driving environments without introducing
additional transition or complex generative models. Specifically, UNREST
estimates state uncertainties by the conditional mutual information between
transitions and returns, and segments sequences accordingly. Discovering the
'uncertainty accumulation' and 'temporal locality' properties of driving
environments, UNREST replaces the global returns in decision transformers with
less uncertain truncated returns, to learn from true outcomes of agent actions
rather than environment transitions. We also dynamically evaluate environmental
uncertainty during inference for cautious planning. Extensive experimental
results demonstrate UNREST's superior performance in various driving scenarios
and the power of our uncertainty estimation strategy.
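The abstract's key mechanism — segmenting trajectories at uncertain states and conditioning on truncated rather than global returns — can be sketched as follows. This is a minimal reading of the abstract, not the paper's actual implementation: the `uncertainties` array stands in for UNREST's conditional-mutual-information estimate, and `threshold` is a hypothetical segmentation knob.

```python
def truncated_returns_to_go(rewards, uncertainties, threshold):
    """Sketch of UNREST-style truncated returns-to-go.

    Scanning right-to-left, we accumulate an ordinary return-to-go but
    reset the accumulator at every step whose estimated uncertainty
    exceeds the threshold, so each step's return only covers the segment
    of the trajectory up to the next highly uncertain transition.
    """
    T = len(rewards)
    returns = [0.0] * T
    running = 0.0
    for t in range(T - 1, -1, -1):
        if uncertainties[t] > threshold:
            # Uncertain transition: truncate here and start a new segment,
            # so earlier steps do not learn from outcomes beyond this point.
            running = 0.0
        running += rewards[t]
        returns[t] = running
    return returns
```

With rewards `[1, 1, 1, 1]` and an uncertain step at index 2, this yields `[3.0, 2.0, 1.0, 1.0]` instead of the global returns-to-go `[4, 3, 2, 1]` — steps before the uncertain transition no longer condition on rewards that depend on how the environment resolves it.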
Related papers
- Sample-efficient Imitative Multi-token Decision Transformer for Generalizable Real World Driving [18.34685506480288]
We propose Sample-efficient Imitative Multi-token Decision Transformer (SimDT).
SimDT introduces multi-token prediction, imitative online learning and prioritized experience replay to Decision Transformer.
Results exceed popular imitation and reinforcement learning algorithms on Waymax benchmark.
arXiv Detail & Related papers (2024-06-18T14:27:14Z)
- Latent Plan Transformer: Planning as Latent Variable Inference [53.419249906014194]
We study generative modeling for planning with datasets repurposed from offline reinforcement learning.
We introduce the Latent Plan Transformer, a novel model that leverages a latent space to connect a Transformer-based trajectory generator and the final return.
At test time, the latent variable is inferred from an expected return before policy execution, realizing the idea of planning as inference.
arXiv Detail & Related papers (2024-02-07T08:18:09Z)
- Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting [11.106812447960186]
We introduce a novel trajectory generator named Controllable Diffusion Trajectory (CDT).
CDT integrates information and social interactions into a Transformer-based conditional denoising diffusion model to guide the prediction of future trajectories.
To ensure multimodality, we incorporate behavioral tokens to direct the trajectory's modes, such as going straight, turning right or left.
arXiv Detail & Related papers (2024-02-06T13:16:54Z)
- SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Controllable Adversaries [94.84458417662407]
We introduce SAFE-SIM, a novel diffusion-based controllable closed-loop safety-critical simulation framework.
We develop a novel approach to simulate safety-critical scenarios through an adversarial term in the denoising process.
We validate our framework empirically using the NuScenes dataset, demonstrating improvements in both realism and controllability.
arXiv Detail & Related papers (2023-12-31T04:14:43Z)
- Time-series Generation by Contrastive Imitation [87.51882102248395]
We study a generative framework that seeks to combine the strengths of both: Motivated by a moment-matching objective to mitigate compounding error, we optimize a local (but forward-looking) transition policy.
At inference, the learned policy serves as the generator for iterative sampling, and the learned energy serves as a trajectory-level measure for evaluating sample quality.
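The inference procedure described above — a learned transition policy generating trajectories step by step, with a learned energy scoring complete samples — can be sketched as below. `policy` and `energy` are hypothetical stand-in callables, not the paper's actual API; lower energy is assumed to mean higher sample quality.

```python
def sample_and_rank(policy, energy, init_state, horizon, n_samples):
    """Hypothetical sketch of contrastive-imitation-style inference.

    The local transition policy rolls each trajectory forward one step
    at a time (iterative sampling); the trajectory-level energy then
    ranks the completed samples so the best can be selected.
    """
    trajectories = []
    for _ in range(n_samples):
        traj, state = [], init_state
        for _ in range(horizon):
            state = policy(state)  # one forward-looking transition step
            traj.append(state)
        trajectories.append(traj)
    # Rank whole trajectories by the learned energy (lowest first).
    return sorted(trajectories, key=energy)
```

The design point this illustrates: the policy only needs to be locally accurate per step, while the energy provides a global, trajectory-level check that mitigates compounding error.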
arXiv Detail & Related papers (2023-11-02T16:45:25Z)
- Dealing with uncertainty: balancing exploration and exploitation in deep recurrent reinforcement learning [0.0]
Incomplete knowledge of the environment leads an agent to make decisions under uncertainty.
This is one of the major dilemmas in Reinforcement Learning (RL): an autonomous agent has to balance two contrasting needs in making its decisions.
We show that adaptive methods better approximate the trade-off between exploration and exploitation.
arXiv Detail & Related papers (2023-10-12T13:45:33Z)
- Environment Transformer and Policy Optimization for Model-Based Offline Reinforcement Learning [25.684201757101267]
We propose an uncertainty-aware sequence modeling architecture called Environment Transformer.
Benefiting from the accurate modeling of the transition dynamics and reward function, Environment Transformer can be combined with arbitrary planning, dynamic programming, or policy optimization algorithms for offline RL.
arXiv Detail & Related papers (2023-03-07T11:26:09Z)
- Augmenting Reinforcement Learning with Transformer-based Scene Representation Learning for Decision-making of Autonomous Driving [27.84595432822612]
We propose Scene-Rep Transformer to improve the reinforcement learning decision-making capabilities.
A multi-stage Transformer (MST) encoder is constructed to model the interaction awareness between the ego vehicle and its neighbors.
A sequential latent Transformer (SLT) with self-supervised learning objectives is employed to distill the future predictive information into the latent scene representation.
arXiv Detail & Related papers (2022-08-24T08:05:18Z)
- Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts? [104.04999499189402]
Out-of-training-distribution (OOD) scenarios are a common challenge of learning agents at deployment.
We propose an uncertainty-aware planning method called robust imitative planning (RIP).
Our method can detect and recover from some distribution shifts, reducing the overconfident and catastrophic extrapolations in OOD scenes.
We introduce an autonomous car novel-scene benchmark, CARNOVEL, to evaluate the robustness of driving agents to a suite of tasks with distribution shifts.
arXiv Detail & Related papers (2020-06-26T11:07:32Z)
- Deep Reinforcement Learning amidst Lifelong Non-Stationarity [67.24635298387624]
We show that an off-policy RL algorithm can reason about and tackle lifelong non-stationarity.
Our method leverages latent variable models to learn a representation of the environment from current and past experiences.
We also introduce several simulation environments that exhibit lifelong non-stationarity, and empirically find that our approach substantially outperforms approaches that do not reason about environment shift.
arXiv Detail & Related papers (2020-06-18T17:34:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.