LMPOcc: 3D Semantic Occupancy Prediction Utilizing Long-Term Memory Prior from Historical Traversals
- URL: http://arxiv.org/abs/2504.13596v1
- Date: Fri, 18 Apr 2025 09:58:48 GMT
- Title: LMPOcc: 3D Semantic Occupancy Prediction Utilizing Long-Term Memory Prior from Historical Traversals
- Authors: Shanshuai Yuan, Julong Wei, Muer Tie, Xiangyun Ren, Zhongxue Gan, Wenchao Ding
- Abstract summary: Longterm Memory Prior Occupancy (LMPOcc) is the first 3D occupancy prediction methodology that exploits long-term memory priors derived from historical perceptual outputs. We introduce a plug-and-play architecture that integrates long-term memory priors to enhance local perception while simultaneously constructing global occupancy representations.
- Score: 4.970345700893879
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Vision-based 3D semantic occupancy prediction is critical for autonomous driving, enabling unified modeling of static infrastructure and dynamic agents. In practice, autonomous vehicles may repeatedly traverse identical geographic locations under varying environmental conditions, such as weather fluctuations and illumination changes. Existing methods in 3D occupancy prediction predominantly integrate adjacent temporal contexts. However, these works do not leverage the perceptual information acquired from historical traversals of identical geographic locations. In this paper, we propose Longterm Memory Prior Occupancy (LMPOcc), the first 3D occupancy prediction methodology that exploits long-term memory priors derived from historical traversal perceptual outputs. We introduce a plug-and-play architecture that integrates long-term memory priors to enhance local perception while simultaneously constructing global occupancy representations. To adaptively aggregate prior features and current features, we develop an efficient lightweight Current-Prior Fusion module. Moreover, we propose a model-agnostic prior format to ensure compatibility across diverse occupancy prediction baselines. LMPOcc achieves state-of-the-art performance on the Occ3D-nuScenes benchmark, especially on static semantic categories. Additionally, experimental results demonstrate LMPOcc's ability to construct global occupancy through multi-vehicle crowdsourcing.
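The abstract mentions adaptively aggregating prior features and current features but does not give the design of the Current-Prior Fusion module. As a rough illustration only, the sketch below blends two feature maps with a learned per-cell gate, which is one generic way such aggregation could be done. It is not the authors' module: the function name `fuse_current_with_prior`, the parameters `gate_weight` and `gate_bias`, the `(C, H, W)` bird's-eye-view layout, and the sigmoid gate are all assumptions made for this example.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def fuse_current_with_prior(current, prior, gate_weight, gate_bias):
    """Blend current and prior feature maps with a per-cell gate.

    Hypothetical illustration, not the LMPOcc Current-Prior Fusion module.
    current, prior: arrays of shape (C, H, W).
    gate_weight:    array of shape (2 * C,), acting like a 1x1 convolution.
    gate_bias:      scalar added to every gate logit.
    """
    stacked = np.concatenate([current, prior], axis=0)  # (2C, H, W)
    # Contract the channel axis to get one gate logit per spatial cell.
    logits = np.tensordot(gate_weight, stacked, axes=([0], [0])) + gate_bias
    gate = sigmoid(logits)[None, :, :]  # (1, H, W), broadcast over channels
    # gate near 1 trusts the current observation, gate near 0 trusts the prior.
    return gate * current + (1.0 - gate) * prior


rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
current = rng.normal(size=(C, H, W))
prior = rng.normal(size=(C, H, W))
gate_weight = rng.normal(size=(2 * C,))

fused = fuse_current_with_prior(current, prior, gate_weight, 0.0)
print(fused.shape)
```

Because the gate lies strictly between 0 and 1, every fused value is a convex combination of the current and prior values at that cell. With all-zero gate parameters the result reduces to a plain average of the two maps.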
Related papers
- TEOcc: Radar-camera Multi-modal Occupancy Prediction via Temporal Enhancement [5.860326420490923]
We propose a radar-camera multi-modal temporal enhanced occupancy prediction network, dubbed TEOcc.
Our method is inspired by the success of utilizing temporal information in 3D object detection.
Experiment results demonstrate that TEOcc achieves state-of-the-art occupancy prediction on nuScenes benchmarks.
arXiv Detail & Related papers (2024-10-15T03:20:48Z) - OPUS: Occupancy Prediction Using a Sparse Set [64.60854562502523]
We present a framework to simultaneously predict occupied locations and classes using a set of learnable queries.
OPUS incorporates a suite of non-trivial strategies to enhance model performance.
Our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at near 2x FPS, while our heaviest model surpasses previous best results by 6.1 RayIoU.
arXiv Detail & Related papers (2024-09-14T07:44:22Z) - HPNet: Dynamic Trajectory Forecasting with Historical Prediction Attention [76.37139809114274]
HPNet is a novel dynamic trajectory forecasting method.
We propose a Historical Prediction Attention module to automatically encode the dynamic relationship between successive predictions.
Our code is available at https://github.com/XiaolongTang23/HPNet.
arXiv Detail & Related papers (2024-04-09T14:42:31Z) - AMP: Autoregressive Motion Prediction Revisited with Next Token Prediction for Autonomous Driving [59.94343412438211]
We introduce GPT-style next-token prediction into motion prediction.
Unlike language data, which is composed of homogeneous units (words), the elements in a driving scene can have complex spatial-temporal and semantic relations.
We propose to adopt three factorized attention modules with different neighbors for information aggregation and different position encoding styles to capture their relations.
arXiv Detail & Related papers (2024-03-20T06:22:37Z) - Spatial-Temporal Large Language Model for Traffic Prediction [21.69991612610926]
We propose a Spatial-Temporal Large Language Model (ST-LLM) for traffic prediction.
In the ST-LLM, we define timesteps at each location as tokens and design a spatial-temporal embedding to learn the spatial location and global temporal patterns of these tokens.
In experiments on real traffic datasets, ST-LLM is a powerful spatial-temporal learner that outperforms state-of-the-art models.
arXiv Detail & Related papers (2024-01-18T17:03:59Z) - Rethinking Urban Mobility Prediction: A Super-Multivariate Time Series Forecasting Approach [71.67506068703314]
Long-term urban mobility predictions play a crucial role in the effective management of urban facilities and services.
Traditionally, urban mobility data has been structured as videos, treating longitude and latitude as fundamental pixels.
In our research, we introduce a fresh perspective on urban mobility prediction.
Instead of oversimplifying urban mobility data as traditional video data, we regard it as a complex time series.
arXiv Detail & Related papers (2023-12-04T07:39:05Z) - FlashOcc: Fast and Memory-Efficient Occupancy Prediction via Channel-to-Height Plugin [32.172269679513285]
FlashOCC consolidates rapid and memory-efficient occupancy prediction.
Channel-to-height transformation is introduced to lift the output logits from the BEV into the 3D space.
Results substantiate the superiority of our plug-and-play paradigm over previous state-of-the-art methods.
arXiv Detail & Related papers (2023-11-18T15:28:09Z) - Context-aware multi-head self-attentional neural network model for next location prediction [19.640761373993417]
We utilize a multi-head self-attentional (MHSA) neural network that learns location patterns from historical location visits.
We demonstrate that the proposed model outperforms other state-of-the-art prediction models.
We believe that the proposed model is vital for context-aware mobility prediction.
arXiv Detail & Related papers (2022-12-04T23:40:14Z) - LOPR: Latent Occupancy PRediction using Generative Models [49.15687400958916]
LiDAR-generated occupancy grid maps (L-OGMs) offer a robust bird's-eye-view scene representation.
We propose a framework that decouples occupancy prediction into representation learning and prediction within the learned latent space.
arXiv Detail & Related papers (2022-10-03T22:04:00Z) - Predicting Future Occupancy Grids in Dynamic Environment with Spatio-Temporal Learning [63.25627328308978]
We propose a spatio-temporal prediction network pipeline to generate future occupancy predictions.
Compared to current SOTA, our approach predicts occupancy for a longer horizon of 3 seconds.
We publicly release our grid occupancy dataset based on nuScenes to support further research.
arXiv Detail & Related papers (2022-05-06T13:45:32Z) - Physically constrained short-term vehicle trajectory forecasting with naive semantic maps [6.85316573653194]
We propose a model that learns to extract relevant road features from semantic maps as well as general motion of agents.
We show that our model is not only capable of anticipating future motion whilst taking into consideration road boundaries, but can also effectively and precisely predict trajectories for a longer time horizon than initially trained for.
arXiv Detail & Related papers (2020-06-09T09:52:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.