LiSTAR: Ray-Centric World Models for 4D LiDAR Sequences in Autonomous Driving
- URL: http://arxiv.org/abs/2511.16049v1
- Date: Thu, 20 Nov 2025 05:11:22 GMT
- Title: LiSTAR: Ray-Centric World Models for 4D LiDAR Sequences in Autonomous Driving
- Authors: Pei Liu, Songtao Wang, Lang Zhang, Xingyue Peng, Yuandong Lyu, Jiaxin Deng, Songxin Lu, Weiliang Ma, Xueyang Zhang, Yifei Zhan, XianPeng Lang, Jun Ma
- Abstract summary: LiSTAR is a novel generative world model that operates directly on the sensor's native geometry. LiSTAR captures complex dynamics from sparse temporal data. Experiments validate LiSTAR's performance across 4D LiDAR reconstruction, prediction, and conditional generation.
- Score: 8.465161411966761
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Synthesizing high-fidelity and controllable 4D LiDAR data is crucial for creating scalable simulation environments for autonomous driving. This task is inherently challenging due to the sensor's unique spherical geometry, the temporal sparsity of point clouds, and the complexity of dynamic scenes. To address these challenges, we present LiSTAR, a novel generative world model that operates directly on the sensor's native geometry. LiSTAR introduces a Hybrid-Cylindrical-Spherical (HCS) representation to preserve data fidelity by mitigating quantization artifacts common in Cartesian grids. To capture complex dynamics from sparse temporal data, it utilizes a Spatio-Temporal Attention with Ray-Centric Transformer (START) that explicitly models feature evolution along individual sensor rays for robust temporal coherence. Furthermore, for controllable synthesis, we propose a novel 4D point cloud-aligned voxel layout for conditioning and a corresponding discrete Masked Generative START (MaskSTART) framework, which learns a compact, tokenized representation of the scene, enabling efficient, high-resolution, and layout-guided compositional generation. Comprehensive experiments validate LiSTAR's state-of-the-art performance across 4D LiDAR reconstruction, prediction, and conditional generation, with substantial quantitative gains: reducing generation MMD by a massive 76%, improving reconstruction IoU by 32%, and lowering prediction L1 Med by 50%. This level of performance provides a powerful new foundation for creating realistic and controllable autonomous systems simulations. Project link: https://ocean-luna.github.io/LiSTAR.gitub.io.
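The abstract names a Hybrid-Cylindrical-Spherical (HCS) representation but does not spell out how the grid is built. As a rough illustration of what a sensor-aligned (rather than Cartesian) voxelization can look like, the Python sketch below bins LiDAR returns by azimuth, elevation, and range; the function name, bin counts, and the exact cylindrical/spherical split are assumptions made for illustration, not LiSTAR's actual formulation.

```python
import numpy as np

def sensor_aligned_voxelize(points, n_azimuth=1024, n_elevation=64,
                            n_range=256, max_range=80.0):
    """Toy sketch of a sensor-aligned voxel grid (NOT LiSTAR's HCS definition).

    points: (N, 3) array of Cartesian LiDAR returns (x, y, z) in meters.
    Returns a binary occupancy grid indexed by (azimuth, elevation, range) bins.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    rho = np.hypot(x, y)                        # planar (cylindrical) radius
    rng = np.sqrt(x**2 + y**2 + z**2)           # spherical range to the return
    azimuth = np.arctan2(y, x)                  # angle in [-pi, pi)
    elevation = np.arctan2(z, rho)              # angle above the sensor plane

    az_idx = ((azimuth + np.pi) / (2 * np.pi) * n_azimuth).astype(int) % n_azimuth
    el_idx = np.clip(((elevation + np.pi / 2) / np.pi * n_elevation).astype(int),
                     0, n_elevation - 1)
    r_idx = np.clip((rng / max_range * n_range).astype(int), 0, n_range - 1)

    grid = np.zeros((n_azimuth, n_elevation, n_range), dtype=np.float32)
    grid[az_idx, el_idx, r_idx] = 1.0           # mark occupied sensor-aligned cells
    return grid
```

The point of such a layout, as the abstract argues, is that cells follow the sensor's ray pattern: angular bins give near and far returns comparable resolution along each ray, instead of squeezing dense near-field points and sparse far-field points into fixed-size Cartesian voxels.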
Related papers
- U4D: Uncertainty-Aware 4D World Modeling from LiDAR Sequences [54.77163447282599]
Existing generative frameworks treat all spatial regions uniformly, overlooking the varying uncertainty across real-world scenes. We present U4D, an uncertainty-aware framework for 4D LiDAR world modeling. Our approach first estimates spatial uncertainty maps from a pretrained segmentation model to localize semantically challenging regions. It then performs generation in a "hard-to-easy" manner through two sequential stages: (1) uncertainty-region modeling, which reconstructs high-entropy regions with fine geometric fidelity, and (2) uncertainty-conditioned completion, which synthesizes the remaining areas under learned structural priors.
arXiv Detail & Related papers (2025-12-02T17:59:57Z)
- LaGen: Towards Autoregressive LiDAR Scene Generation [66.95324368583536]
We introduce LaGen, which to the best of our knowledge is the first framework capable of frame-by-frame autoregressive generation of long-horizon LiDAR scenes. LaGen takes a single-frame LiDAR input as a starting point and effectively utilizes bounding box information as conditioning to generate high-fidelity 4D scene point clouds.
arXiv Detail & Related papers (2025-11-26T10:39:16Z)
- Scaling Up Occupancy-centric Driving Scene Generation: Dataset and Method [54.461213497603154]
Occupancy-centric methods have recently achieved state-of-the-art results by offering consistent conditioning across frames and modalities. Nuplan-Occ is the largest occupancy dataset to date, constructed from the widely used Nuplan benchmark. We develop a unified framework that jointly synthesizes high-quality occupancy, multi-view videos, and LiDAR point clouds.
arXiv Detail & Related papers (2025-10-27T03:52:45Z)
- Learning to Generate 4D LiDAR Sequences [28.411253849111755]
We present LiDARCrafter, a unified framework that converts free-form language into editable LiDAR sequences. LiDARCrafter achieves state-of-the-art fidelity, controllability, and temporal consistency, offering a foundation for LiDAR-based simulation and data augmentation.
arXiv Detail & Related papers (2025-09-15T14:14:48Z)
- LiDARCrafter: Dynamic 4D World Modeling from LiDAR Sequences [28.411253849111755]
LiDARCrafter is a unified framework for 4D LiDAR generation and editing. It achieves state-of-the-art performance in fidelity, controllability, and temporal consistency across all levels. The code and benchmark are released to the community.
arXiv Detail & Related papers (2025-08-05T17:59:56Z)
- GS-LiDAR: Generating Realistic LiDAR Point Clouds with Panoramic Gaussian Splatting [3.376357029373187]
GS-LiDAR is a novel framework for generating realistic LiDAR point clouds with panoramic Gaussian splatting. We introduce a novel panoramic rendering technique with explicit ray-splat intersection, guided by panoramic LiDAR supervision.
arXiv Detail & Related papers (2025-01-22T11:21:20Z)
- LiDAR-GS: Real-time LiDAR Re-Simulation using Gaussian Splatting [53.58528891081709]
We present LiDAR-GS, a real-time, high-fidelity method for re-simulating LiDAR scans in public urban road scenes. The method achieves state-of-the-art results in both rendering frame rate and quality on publicly available large scene datasets.
arXiv Detail & Related papers (2024-10-07T15:07:56Z)
- Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata [70.9375320609781]
We aim to generate fine-grained 3D geometry from large-scale sparse LiDAR scans, abundantly captured by autonomous vehicles (AVs).
We propose hierarchical Generative Cellular Automata (hGCA), a spatially scalable 3D generative model that grows geometry with local kernels in a coarse-to-fine manner, equipped with a lightweight planner to induce global consistency.
arXiv Detail & Related papers (2024-06-12T14:56:56Z)
- LidarDM: Generative LiDAR Simulation in a Generated World [21.343346521878864]
LidarDM is a novel LiDAR generative model capable of producing realistic, layout-aware, physically plausible, and temporally coherent LiDAR videos.
We employ latent diffusion models to generate the 3D scene, combine it with dynamic actors to form the underlying 4D world, and subsequently produce realistic sensory observations within this virtual environment.
Our experiments indicate that our approach outperforms competing algorithms in realism, temporal coherency, and layout consistency.
arXiv Detail & Related papers (2024-04-03T17:59:28Z)
- LiDAR4D: Dynamic Neural Fields for Novel Space-time View LiDAR Synthesis [11.395101473757443]
We propose LiDAR4D, a differentiable LiDAR-only framework for novel space-time LiDAR view synthesis.
In consideration of the sparsity and large-scale characteristics, we design a 4D hybrid representation combined with multi-planar and grid features.
For the realistic synthesis of LiDAR point clouds, we incorporate the global optimization of ray-drop probability to preserve cross-region patterns.
arXiv Detail & Related papers (2024-04-03T13:39:29Z)
- StarNet: Style-Aware 3D Point Cloud Generation [82.30389817015877]
StarNet is able to reconstruct and generate high-fidelity 3D point clouds using a mapping network.
Our framework achieves performance comparable to the state of the art on various metrics in the point cloud reconstruction and generation tasks.
arXiv Detail & Related papers (2023-03-28T08:21:44Z)
- Learning to Simulate Realistic LiDARs [66.7519667383175]
We introduce a pipeline for data-driven simulation of a realistic LiDAR sensor.
We show that our model can learn to encode realistic effects such as dropped points on transparent surfaces.
We use our technique to learn models of two distinct LiDAR sensors and use them to improve simulated LiDAR data accordingly.
arXiv Detail & Related papers (2022-09-22T13:12:54Z)
- LiDAR-based 4D Panoptic Segmentation via Dynamic Shifting Network [56.71765153629892]
We propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm.
We extend DS-Net to 4D panoptic LiDAR segmentation via temporally unified instance clustering on aligned LiDAR frames.
Our proposed DS-Net achieves superior accuracy over current state-of-the-art methods in both tasks.
arXiv Detail & Related papers (2022-03-14T15:25:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.