Related papers: FreeGen: Feed-Forward Reconstruction-Generation Co-Training for Free-Viewpoint Driving Scene Synthesis

URL: http://arxiv.org/abs/2512.04830v1
Date: Thu, 04 Dec 2025 14:14:21 GMT
Title: FreeGen: Feed-Forward Reconstruction-Generation Co-Training for Free-Viewpoint Driving Scene Synthesis
Authors: Shijie Chen, Peixi Peng,
Abstract summary: FreeGen is a feed-forward reconstruction-generation co-training framework for free-viewpoint driving scene synthesis. We show that FreeGen achieves state-of-the-art performance for free-viewpoint driving scene synthesis.
Score: 24.76850299847588
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Closed-loop simulation and scalable pre-training for autonomous driving require synthesizing free-viewpoint driving scenes. However, existing datasets and generative pipelines rarely provide consistent off-trajectory observations, limiting large-scale evaluation and training. While recent generative models demonstrate strong visual realism, they struggle to jointly achieve interpolation consistency and extrapolation realism without per-scene optimization. To address this, we propose FreeGen, a feed-forward reconstruction-generation co-training framework for free-viewpoint driving scene synthesis. The reconstruction model provides stable geometric representations to ensure interpolation consistency, while the generation model performs geometry-aware enhancement to improve realism at unseen viewpoints. Through co-training, generative priors are distilled into the reconstruction model to improve off-trajectory rendering, and the refined geometry in turn offers stronger structural guidance for generation. Experiments demonstrate that FreeGen achieves state-of-the-art performance for free-viewpoint driving scene synthesis.

Related papers

DiffusionHarmonizer: Bridging Neural Reconstruction and Photorealistic Simulation with Online Diffusion Enhancer [62.18680935878919]
We introduce DiffusionHarmonizer, an online generative enhancement framework that transforms renderings into temporally consistent outputs. At its core is a single-step temporally-conditioned enhancer capable of running in online simulators on a single GPU.
arXiv Detail & Related papers (2026-02-27T15:35:30Z)
DrivingGen: A Comprehensive Benchmark for Generative Video World Models in Autonomous Driving [49.11389494068169]
We present DrivingGen, the first comprehensive benchmark for generative driving world models. DrivingGen combines a diverse evaluation dataset curated from both driving datasets and internet-scale video sources. General models look better but break physics, while driving-specific ones capture motion realistically but lag in visual quality.
arXiv Detail & Related papers (2026-01-04T13:36:21Z)
SymDrive: Realistic and Controllable Driving Simulator via Symmetric Auto-regressive Online Restoration [37.202523124756034]
Current approaches often falter in large-angle novel view synthesis and suffer from geometric or lighting artifacts during asset manipulation. We propose SymDrive, a unified diffusion-based framework capable of joint high-quality rendering and scene editing. We demonstrate that SymDrive achieves photorealistic state-of-the-art performance in both novel-view enhancement and realistic 3D vehicle insertion.
arXiv Detail & Related papers (2025-12-25T10:28:43Z)
Optimization-Guided Diffusion for Interactive Scene Generation [52.23368750264419]
We present OMEGA, an optimization-guided, training-free framework that enforces structural consistency and interaction awareness during diffusion-based sampling. We show that OMEGA improves generation realism, consistency, and controllability, increasing the ratio of physically and behaviorally valid scenes. Our approach can also generate $5\times$ more near-collision frames with a time-to-collision under three seconds.
arXiv Detail & Related papers (2025-12-08T15:56:18Z)
HybridWorldSim: A Scalable and Controllable High-fidelity Simulator for Autonomous Driving [59.55918581964678]
HybridWorldSim is a hybrid simulation framework that integrates multi-traversal neural reconstruction for static backgrounds with generative modeling for dynamic agents. We release a new multi-traversal dataset MIRROR that captures a wide range of routes and environmental conditions across different cities.
arXiv Detail & Related papers (2025-11-27T07:53:16Z)
Scaling Up Occupancy-centric Driving Scene Generation: Dataset and Method [54.461213497603154]
Occupancy-centric methods have recently achieved state-of-the-art results by offering consistent conditioning across frames and modalities. Nuplan-Occ is the largest occupancy dataset to date, constructed from the widely used Nuplan benchmark. We develop a unified framework that jointly synthesizes high-quality occupancy, multi-view videos, and LiDAR point clouds.
arXiv Detail & Related papers (2025-10-27T03:52:45Z)
ReCoM: Realistic Co-Speech Motion Generation with Recurrent Embedded Transformer [58.49950218437718]
We present ReCoM, an efficient framework for generating high-fidelity and generalizable human body motions synchronized with speech. The core innovation lies in the Recurrent Embedded Transformer (RET), which integrates Dynamic Embedding Regularization (DER) into a Vision Transformer (ViT) core architecture. To enhance model robustness, we incorporate the proposed DER strategy, which equips the model with dual capabilities of noise resistance and cross-domain generalization.
arXiv Detail & Related papers (2025-03-27T16:39:40Z)
Bench2Drive-R: Turning Real World Data into Reactive Closed-Loop Autonomous Driving Benchmark by Generative Model [63.336123527432136]
We introduce Bench2Drive-R, a generative framework that enables reactive closed-loop evaluation. Unlike existing video generative models for autonomous driving, the proposed designs are tailored for interactive simulation. We compare the generation quality of Bench2Drive-R with existing generative models and achieve state-of-the-art performance.
arXiv Detail & Related papers (2024-12-11T06:35:18Z)
Driving View Synthesis on Free-form Trajectories with Generative Prior [39.24591650300784]
DriveX is a novel free-form driving view synthesis framework. It distills generative prior into the 3D Gaussian model during its optimization. It achieves high-quality view synthesis beyond recorded trajectories in real time.
arXiv Detail & Related papers (2024-12-02T17:07:53Z)
HarmonicNeRF: Geometry-Informed Synthetic View Augmentation for 3D Scene Reconstruction in Driving Scenarios [2.949710700293865]
HarmonicNeRF is a novel approach for outdoor self-supervised monocular scene reconstruction. It capitalizes on the strengths of NeRF and enhances surface reconstruction accuracy by augmenting the input space with geometry-informed synthetic views. Our approach establishes new benchmarks in synthesizing novel depth views and reconstructing scenes, significantly outperforming existing methods.
arXiv Detail & Related papers (2023-10-09T07:42:33Z)

This list is automatically generated from the titles and abstracts of the papers on this site.