LaGen: Towards Autoregressive LiDAR Scene Generation
- URL: http://arxiv.org/abs/2511.21256v1
- Date: Wed, 26 Nov 2025 10:39:16 GMT
- Title: LaGen: Towards Autoregressive LiDAR Scene Generation
- Authors: Sizhuo Zhou, Xiaosong Jia, Fanrui Zhang, Junjie Li, Juyong Zhang, Yukang Feng, Jianwen Sun, Songbur Wong, Junqi You, Junchi Yan
- Abstract summary: We introduce LaGen, which to the best of our knowledge is the first framework capable of frame-by-frame autoregressive generation of long-horizon LiDAR scenes. LaGen is able to take a single-frame LiDAR input as a starting point and effectively utilize bounding box information as conditions to generate high-fidelity 4D scene point clouds.
- Score: 66.95324368583536
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative world models for autonomous driving (AD) have become a trending topic. Unlike the widely studied image modality, in this work we explore generative world models for LiDAR data. Existing generation methods for LiDAR data only support single frame generation, while existing prediction approaches require multiple frames of historical input and can only deterministically predict multiple frames at once, lacking interactivity. Both paradigms fail to support long-horizon interactive generation. To this end, we introduce LaGen, which to the best of our knowledge is the first framework capable of frame-by-frame autoregressive generation of long-horizon LiDAR scenes. LaGen is able to take a single-frame LiDAR input as a starting point and effectively utilize bounding box information as conditions to generate high-fidelity 4D scene point clouds. In addition, we introduce a scene decoupling estimation module to enhance the model's interactive generation capability for object-level content, as well as a noise modulation module to mitigate error accumulation during long-horizon generation. We construct a protocol based on nuScenes for evaluating long-horizon LiDAR scene generation. Experimental results comprehensively demonstrate LaGen outperforms state-of-the-art LiDAR generation and prediction models, especially on the later frames.
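The autoregressive scheme the abstract describes (a single starting frame, per-step bounding-box conditioning, and noise modulation against error accumulation) can be sketched in outline. This is a minimal illustrative sketch only: `generate_sequence`, the toy model, and the decaying noise schedule are hypothetical placeholders, not the authors' released code or their actual noise modulation module.

```python
import numpy as np

def generate_sequence(model, first_frame, box_conditions, horizon):
    """Hypothetical autoregressive rollout: each new frame is produced
    from the previous frame plus that step's bounding-box condition."""
    frames = [first_frame]
    for t in range(1, horizon):
        prev = frames[-1]
        # Illustrative stand-in for noise modulation: lightly perturb the
        # previous frame so errors do not compound deterministically.
        noise_scale = 0.1 / (1 + t)  # decays with the time step
        noisy_prev = prev + noise_scale * np.random.randn(*prev.shape)
        frames.append(model(noisy_prev, box_conditions[t]))
    return frames

# Toy stand-in "model": shift the point cloud by the box condition.
toy_model = lambda points, box: points + box
pts0 = np.zeros((4, 3))                       # 4 points, xyz
boxes = [np.full(3, 0.01) for _ in range(5)]  # one condition per step
seq = generate_sequence(toy_model, pts0, boxes, horizon=5)
print(len(seq))  # 5 frames: the input frame plus 4 generated ones
```

The point of the structure is interactivity: because each frame is generated from only the previous frame and that step's conditions, the caller can change the bounding-box conditions mid-rollout, which multi-frame deterministic predictors cannot accommodate.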
Related papers
- DuoGen: Towards General Purpose Interleaved Multimodal Generation [65.13479486098419]
DuoGen is a general-purpose interleaved generation framework that addresses data curation, architecture design, and evaluation. We build a large-scale, high-quality instruction-tuning dataset by combining multimodal conversations rewritten from curated raw websites. A two-stage decoupled strategy first instruction-tunes the MLLM, then aligns DiT with it using curated interleaved image-text sequences.
arXiv Detail & Related papers (2026-01-31T04:35:15Z)
- LiDAR-GS++: Improving LiDAR Gaussian Reconstruction via Diffusion Priors [51.724649822336346]
We present LiDAR-GS++, a reconstruction method enhanced by diffusion priors for real-time and high-fidelity re-simulation. Specifically, we introduce a controllable LiDAR generation model conditioned on coarsely extrapolated rendering to produce extra geometry-consistent scans. By extending reconstruction to under-fitted regions, our approach ensures global geometric consistency for extrapolative novel views.
arXiv Detail & Related papers (2025-11-15T17:33:12Z)
- Scaling Up Occupancy-centric Driving Scene Generation: Dataset and Method [54.461213497603154]
Occupancy-centric methods have recently achieved state-of-the-art results by offering consistent conditioning across frames and modalities. Nuplan-Occ is the largest occupancy dataset to date, constructed from the widely used Nuplan benchmark. We develop a unified framework that jointly synthesizes high-quality occupancy, multi-view videos, and LiDAR point clouds.
arXiv Detail & Related papers (2025-10-27T03:52:45Z)
- Learning to Generate 4D LiDAR Sequences [28.411253849111755]
We present LiDARCrafter, a unified framework that converts free-form language into editable LiDAR sequences. LiDARCrafter achieves state-of-the-art fidelity, controllability, and temporal consistency, offering a foundation for LiDAR-based simulation and data augmentation.
arXiv Detail & Related papers (2025-09-15T14:14:48Z)
- La La LiDAR: Large-Scale Layout Generation from LiDAR Data [45.5317990948996]
Controllable generation of realistic LiDAR scenes is crucial for applications such as autonomous driving and robotics. We propose a Large-scale Layout-guided LiDAR generation model ("La La LiDAR"), a novel layout-guided generative framework. La La LiDAR achieves state-of-the-art performance in both LiDAR generation and downstream perception tasks.
arXiv Detail & Related papers (2025-08-05T17:59:55Z)
- TopoLiDM: Topology-Aware LiDAR Diffusion Models for Interpretable and Realistic LiDAR Point Cloud Generation [15.223634903890863]
TopoLiDM is a novel framework that integrates graph neural networks with diffusion models under topological regularization for high-fidelity LiDAR generation. Our approach first trains a topology-preserving VAE to extract latent graph representations via graph construction and multiple graph convolutional layers. Extensive experiments on the KITTI-360 dataset demonstrate TopoLiDM's superiority over state-of-the-art methods.
arXiv Detail & Related papers (2025-07-30T08:02:42Z)
- LOGen: Toward Lidar Object Generation by Point Diffusion [12.984380275928752]
We consider the task of LiDAR object generation, requiring models to produce 3D objects as viewed by a LiDAR scan. We introduce a novel diffusion-based model to produce LiDAR point clouds of dataset objects, including intensity. Our experiments on nuScenes and KITTI-360 show the quality of our generations measured with new 3D metrics developed to suit LiDAR objects.
arXiv Detail & Related papers (2024-12-10T10:30:27Z)
- LiDAR-GS: Real-time LiDAR Re-Simulation using Gaussian Splatting [53.58528891081709]
We present LiDAR-GS, a real-time, high-fidelity re-simulation of LiDAR scans in public urban road scenes. The method achieves state-of-the-art results in both rendering frame rate and quality on publicly available large scene datasets.
arXiv Detail & Related papers (2024-10-07T15:07:56Z)
- Generative Visual Prompt: Unifying Distributional Control of Pre-Trained Generative Models [77.47505141269035]
Generative Visual Prompt (PromptGen) is a framework for distributional control over pre-trained generative models.
PromptGen approximates an energy-based model (EBM) and samples images in a feed-forward manner.
Code is available at https://github.com/ChenWu98/Generative-Visual-Prompt.
arXiv Detail & Related papers (2022-09-14T22:55:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.