Related papers: Self-supervised Multi-future Occupancy Forecasting for Autonomous Driving

URL: http://arxiv.org/abs/2407.21126v1
Date: Tue, 30 Jul 2024 18:37:59 GMT
Title: Self-supervised Multi-future Occupancy Forecasting for Autonomous Driving
Authors: Bernard Lange, Masha Itkina, Jiachen Li, Mykel J. Kochenderfer
Abstract summary: LiDAR-generated occupancy grid maps (L-OGMs) offer a robust bird's-eye view for the scene representation. Our proposed framework performs L-OGM prediction in the latent space of a generative architecture. We decode predictions using either a single-step decoder, which provides high-quality predictions in real-time, or a diffusion-based batch decoder.
Score: 45.886941596233974
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Environment prediction frameworks are critical for the safe navigation of autonomous vehicles (AVs) in dynamic settings. LiDAR-generated occupancy grid maps (L-OGMs) offer a robust bird's-eye view for the scene representation, enabling self-supervised joint scene predictions while exhibiting resilience to partial observability and perception detection failures. Prior approaches have focused on deterministic L-OGM prediction architectures within the grid cell space. While these methods have seen some success, they frequently produce unrealistic predictions and fail to capture the stochastic nature of the environment. Additionally, they do not effectively integrate additional sensor modalities present in AVs. Our proposed framework performs stochastic L-OGM prediction in the latent space of a generative architecture and allows for conditioning on RGB cameras, maps, and planned trajectories. We decode predictions using either a single-step decoder, which provides high-quality predictions in real-time, or a diffusion-based batch decoder, which can further refine the decoded frames to address temporal consistency issues and reduce compression losses. Our experiments on the nuScenes and Waymo Open datasets show that all variants of our approach qualitatively and quantitatively outperform prior approaches.

Related papers

Foresight in Motion: Reinforcing Trajectory Prediction with Reward Heuristics [34.570579623171476]
"First Reasoning, Then Forecasting" is a strategy that explicitly incorporates behavior intentions as spatial guidance for trajectory prediction. We introduce an interpretable, reward-driven intention reasoner grounded in a novel query-centric Inverse Reinforcement Learning scheme. Our approach significantly enhances trajectory prediction confidence, achieving highly competitive performance relative to state-of-the-art methods.
arXiv Detail & Related papers (2025-07-16T09:46:17Z)
Trajectory Mamba: Efficient Attention-Mamba Forecasting Model Based on Selective SSM [16.532357621144342]
This paper introduces Trajectory Mamba, a novel, efficient trajectory prediction framework based on the selective state-space model (SSM). To address the potential reduction in prediction accuracy resulting from modifications to the attention mechanism, we propose a joint polyline encoding strategy. Our model achieves state-of-the-art results in terms of inference speed and parameter efficiency on both the Argoverse 1 and Argoverse 2 datasets.
arXiv Detail & Related papers (2025-03-13T21:31:12Z)
Fast and Efficient Transformer-based Method for Bird's Eye View Instance Prediction [0.8458547573621331]
This paper introduces a novel BEV instance prediction architecture based on a simplified paradigm. The proposed system prioritizes speed, aiming at reduced parameter counts and inference times. The implementation of the proposed architecture is optimized for performance improvements in PyTorch version 2.1.
arXiv Detail & Related papers (2024-11-11T10:35:23Z)
OPUS: Occupancy Prediction Using a Sparse Set [64.60854562502523]
We present a framework to simultaneously predict occupied locations and classes using a set of learnable queries. OPUS incorporates a suite of non-trivial strategies to enhance model performance. Our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at near 2x FPS, while our heaviest model surpasses previous best results by 6.1 RayIoU.
arXiv Detail & Related papers (2024-09-14T07:44:22Z)
A Novel Deep Neural Network for Trajectory Prediction in Automated Vehicles Using Velocity Vector Field [12.067838086415833]
This paper proposes a novel technique for trajectory prediction that combines a data-driven learning-based method with a velocity vector field (VVF) generated from a nature-inspired concept. The accuracy remains consistent with decreasing observation windows, which alleviates the requirement of a long history of past observations for accurate trajectory prediction.
arXiv Detail & Related papers (2023-09-19T22:14:52Z)
Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving [68.95178518732965]
A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants. Existing works either perform object detection followed by trajectory forecasting of the detected objects, or predict dense occupancy and flow grids for the whole scene. This motivates our unified approach to perception and future prediction that implicitly represents occupancy and flow over time with a single neural network.
arXiv Detail & Related papers (2023-08-02T23:39:24Z)
LOPR: Latent Occupancy PRediction using Generative Models [49.15687400958916]
LiDAR-generated occupancy grid maps (L-OGMs) offer a robust bird's-eye view scene representation. We propose a framework that decouples occupancy prediction into representation learning and prediction within the learned latent space.
arXiv Detail & Related papers (2022-10-03T22:04:00Z)
Conditioned Human Trajectory Prediction using Iterative Attention Blocks [70.36888514074022]
We present a simple yet effective pedestrian trajectory prediction model aimed at predicting pedestrian positions in urban-like environments. Our model is a neural-based architecture that can run several layers of attention blocks and transformers in an iterative sequential fashion. We show that without explicit introduction of social masks, dynamical models, social pooling layers, or complicated graph-like structures, it is possible to produce results on par with SoTA models.
arXiv Detail & Related papers (2022-06-29T07:49:48Z)
SLPC: a VRNN-based approach for stochastic lidar prediction and completion in autonomous driving [63.87272273293804]
We propose a new LiDAR prediction framework that is based on generative models, namely Variational Recurrent Neural Networks (VRNNs). Our algorithm is able to address the limitations of previous video prediction frameworks when dealing with sparse data by spatially inpainting the depth maps in the upcoming frames. We present a sparse version of VRNNs and an effective self-supervised training method that does not require any labels.
arXiv Detail & Related papers (2021-02-19T11:56:44Z)
Spatio-Temporal Graph Dual-Attention Network for Multi-Agent Prediction and Tracking [23.608125748229174]
We propose a generic generative neural system for multi-agent trajectory prediction involving heterogeneous agents. The proposed system is evaluated on three public benchmark datasets for trajectory prediction.
arXiv Detail & Related papers (2021-02-18T02:25:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.