HorizonForge: Driving Scene Editing with Any Trajectories and Any Vehicles
- URL: http://arxiv.org/abs/2602.21333v2
- Date: Sun, 01 Mar 2026 00:13:21 GMT
- Title: HorizonForge: Driving Scene Editing with Any Trajectories and Any Vehicles
- Authors: Yifan Wang, Francesco Pittaluga, Zaid Tasneem, Chenyu You, Manmohan Chandraker, Ziyu Jiang
- Abstract summary: Controllable driving scene generation is critical for realistic and scalable autonomous driving simulation. We introduce HorizonForge, a unified framework that reconstructs scenes as editable Gaussian Splats and Meshes. Experiments show that the Gaussian-Mesh representation delivers substantially higher fidelity than alternative 3D representations.
- Score: 63.88996084630768
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Controllable driving scene generation is critical for realistic and scalable autonomous driving simulation, yet existing approaches struggle to jointly achieve photorealism and precise control. We introduce HorizonForge, a unified framework that reconstructs scenes as editable Gaussian Splats and Meshes, enabling fine-grained 3D manipulation and language-driven vehicle insertion. Edits are rendered through a noise-aware video diffusion process that enforces spatial and temporal consistency, producing diverse scene variations in a single feed-forward pass without per-trajectory optimization. To standardize evaluation, we further propose HorizonSuite, a comprehensive benchmark spanning ego- and agent-level editing tasks such as trajectory modifications and object manipulation. Extensive experiments show that the Gaussian-Mesh representation delivers substantially higher fidelity than alternative 3D representations, and that temporal priors from video diffusion are essential for coherent synthesis. Combining these findings, HorizonForge establishes a simple yet powerful paradigm for photorealistic, controllable driving simulation, achieving an 83.4% user-preference gain and a 25.19% FID improvement over the second-best state-of-the-art method. Project page: https://horizonforge.github.io/ .
Related papers
- SymDrive: Realistic and Controllable Driving Simulator via Symmetric Auto-regressive Online Restoration [37.202523124756034]
Current approaches often falter in large-angle novel view synthesis and suffer from geometric or lighting artifacts during asset manipulation. We propose SymDrive, a unified diffusion-based framework capable of joint high-quality rendering and scene editing. We demonstrate that SymDrive achieves photorealistic state-of-the-art performance in both novel-view enhancement and realistic 3D vehicle insertion.
arXiv Detail & Related papers (2025-12-25T10:28:43Z)
- DrivingGaussian++: Towards Realistic Reconstruction and Editable Simulation for Surrounding Dynamic Driving Scenes [49.23098808629567]
DrivingGaussian++ is an efficient framework for realistic reconstruction and controllable editing of autonomous driving scenes. It supports training-free controllable editing for dynamic driving scenes, including texture modification, weather simulation, and object manipulation. Our method can automatically generate dynamic object motion trajectories and enhance their realism during the optimization process.
arXiv Detail & Related papers (2025-08-28T16:22:54Z)
- SceneCrafter: Controllable Multi-View Driving Scene Editing [44.91248700043744]
We propose SceneCrafter, a versatile editor for realistic 3D-consistent manipulation of driving scenes captured from multiple cameras. SceneCrafter achieves state-of-the-art realism, controllability, 3D consistency, and scene editing quality compared to existing baselines.
arXiv Detail & Related papers (2025-06-24T10:23:47Z)
- StreetCrafter: Street View Synthesis with Controllable Video Diffusion Models [76.62929629864034]
We introduce StreetCrafter, a controllable video diffusion model that utilizes LiDAR point cloud renderings as pixel-level conditions. In addition, the utilization of pixel-level LiDAR conditions allows us to make accurate pixel-level edits to target scenes. Our model enables flexible control over viewpoint changes, enlarging the regions that can be rendered satisfactorily.
arXiv Detail & Related papers (2024-12-17T18:58:55Z)
- Stag-1: Towards Realistic 4D Driving Simulation with Video Generation Model [83.31688383891871]
We propose a Spatial-Temporal simulAtion for drivinG (Stag-1) model to reconstruct real-world scenes. Stag-1 constructs continuous 4D point cloud scenes using surround-view data from autonomous vehicles. It decouples spatial-temporal relationships and produces coherent driving videos.
arXiv Detail & Related papers (2024-12-06T18:59:56Z)
- AutoSplat: Constrained Gaussian Splatting for Autonomous Driving Scene Reconstruction [17.600027937450342]
AutoSplat is a framework employing Gaussian splatting to achieve highly realistic reconstructions of autonomous driving scenes.
Our method enables multi-view consistent simulation of challenging scenarios including lane changes.
arXiv Detail & Related papers (2024-07-02T18:36:50Z)
- DragTraffic: Interactive and Controllable Traffic Scene Generation for Autonomous Driving [10.90477019946728]
DragTraffic is a general, interactive, and controllable traffic scene generation framework based on conditional diffusion.
We employ a regression model to provide a general initial solution and a refinement process based on the conditional diffusion model to ensure diversity.
Experiments on a real-world driving dataset show that DragTraffic outperforms existing methods in terms of authenticity, diversity, and freedom.
arXiv Detail & Related papers (2024-04-19T04:49:28Z)
- DrivingGaussian: Composite Gaussian Splatting for Surrounding Dynamic Autonomous Driving Scenes [57.12439406121721]
We present DrivingGaussian, an efficient and effective framework for surrounding dynamic autonomous driving scenes.
For complex scenes with moving objects, we first sequentially and progressively model the static background of the entire scene.
We then leverage a composite dynamic Gaussian graph to handle multiple moving objects.
We further use a LiDAR prior for Gaussian Splatting to reconstruct scenes with greater details and maintain panoramic consistency.
arXiv Detail & Related papers (2023-12-13T06:30:51Z)
- SceneGen: Learning to Generate Realistic Traffic Scenes [92.98412203941912]
We present SceneGen, a neural autoregressive model of traffic scenes that eschews the need for rules and distributions.
We demonstrate SceneGen's ability to faithfully model distributions of real traffic scenes.
arXiv Detail & Related papers (2021-01-16T22:51:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.