ArmGS: Composite Gaussian Appearance Refinement for Modeling Dynamic Urban Environments
- URL: http://arxiv.org/abs/2507.03886v1
- Date: Sat, 05 Jul 2025 03:54:40 GMT
- Title: ArmGS: Composite Gaussian Appearance Refinement for Modeling Dynamic Urban Environments
- Authors: Guile Wu, Dongfeng Bai, Bingbing Liu
- Abstract summary: This work focuses on modeling dynamic urban environments for autonomous driving simulation. We propose a new approach named ArmGS that exploits composite driving Gaussian splatting with multi-granularity appearance refinement. This not only models global scene appearance variations between frames and camera viewpoints, but also models local fine-grained photorealistic changes of background and objects.
- Score: 22.371417505012566
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work focuses on modeling dynamic urban environments for autonomous driving simulation. Contemporary data-driven methods using neural radiance fields have achieved photorealistic driving scene modeling, but they suffer from low rendering efficiency. Recently, some approaches have explored 3D Gaussian splatting for modeling dynamic urban scenes, enabling high-fidelity reconstruction and real-time rendering. However, these approaches often neglect to model fine-grained variations between frames and camera viewpoints, leading to suboptimal results. In this work, we propose a new approach named ArmGS that exploits composite driving Gaussian splatting with multi-granularity appearance refinement for autonomous driving scene modeling. The core idea of our approach is devising a multi-level appearance modeling scheme to optimize a set of transformation parameters for composite Gaussian refinement from multiple granularities, ranging from the local Gaussian level to the global image level and the dynamic actor level. This not only models global scene appearance variations between frames and camera viewpoints, but also models local fine-grained changes of background and objects. Extensive experiments on multiple challenging autonomous driving datasets, namely Waymo, KITTI, NOTR, and VKITTI2, demonstrate the superiority of our approach over state-of-the-art methods.
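The abstract describes the refinement only at a high level. As a rough illustration, here is a minimal sketch of multi-granularity appearance refinement, assuming simple learnable affine color transforms at the local Gaussian, global image, and dynamic actor levels (the module and all parameter names below are illustrative, not taken from the paper):

```python
import torch
import torch.nn as nn

class MultiGranularityAppearance(nn.Module):
    """Hypothetical sketch: per-Gaussian, per-frame (global image), and
    per-actor affine color refinements composed in sequence."""
    def __init__(self, num_gaussians, num_frames, num_actors):
        super().__init__()
        # Local level: a residual color scale/offset per Gaussian.
        self.local_scale = nn.Parameter(torch.ones(num_gaussians, 3))
        self.local_offset = nn.Parameter(torch.zeros(num_gaussians, 3))
        # Global level: one affine transform per frame to absorb
        # exposure/white-balance shifts between frames and viewpoints.
        self.global_scale = nn.Parameter(torch.ones(num_frames, 3))
        self.global_offset = nn.Parameter(torch.zeros(num_frames, 3))
        # Actor level: one affine transform per dynamic actor.
        self.actor_scale = nn.Parameter(torch.ones(num_actors, 3))
        self.actor_offset = nn.Parameter(torch.zeros(num_actors, 3))

    def forward(self, colors, frame_id, actor_ids):
        # colors: (N, 3) base Gaussian colors; actor_ids: (N,), -1 = background.
        refined = colors * self.local_scale + self.local_offset
        refined = refined * self.global_scale[frame_id] + self.global_offset[frame_id]
        is_actor = actor_ids >= 0
        idx = actor_ids.clamp(min=0)
        actor_term = self.actor_scale[idx] * refined + self.actor_offset[idx]
        refined = torch.where(is_actor.unsqueeze(-1), actor_term, refined)
        return refined.clamp(0.0, 1.0)

# Usage: refine 1000 Gaussian colors for frame 7.
model = MultiGranularityAppearance(num_gaussians=1000, num_frames=50, num_actors=8)
colors = torch.rand(1000, 3)
actor_ids = torch.randint(-1, 8, (1000,))
out = model(colors, frame_id=7, actor_ids=actor_ids)
```

Composing the three levels lets a global per-frame transform absorb exposure changes while the per-Gaussian and per-actor residuals capture the fine-grained local variations the abstract refers to.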
Related papers
- FreeDriveRF: Monocular RGB Dynamic NeRF without Poses for Autonomous Driving via Point-Level Dynamic-Static Decoupling [13.495102292705253]
FreeDriveRF reconstructs dynamic driving scenes using only sequential RGB images, without requiring pose inputs.
We introduce a warped ray-guided dynamic object rendering consistency loss, utilizing optical flow to better constrain the dynamic modeling process.
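The precise form of this consistency loss is not given in the summary; a plausible sketch, assuming renderings at consecutive frames, a precomputed optical flow field, and a dynamic-object mask as hypothetical inputs:

```python
import torch
import torch.nn.functional as F

def flow_warp(img, flow):
    """Backward-warp img (B,C,H,W) using optical flow (B,2,H,W) in pixels."""
    B, _, H, W = img.shape
    ys, xs = torch.meshgrid(torch.arange(H, device=img.device),
                            torch.arange(W, device=img.device), indexing="ij")
    x = 2.0 * (xs + flow[:, 0]) / (W - 1) - 1.0   # normalize to [-1, 1]
    y = 2.0 * (ys + flow[:, 1]) / (H - 1) - 1.0
    grid = torch.stack((x, y), dim=-1)            # (B, H, W, 2)
    return F.grid_sample(img, grid, align_corners=True)

def dynamic_consistency_loss(render_t, render_t1, flow_t_to_t1, dyn_mask):
    """Warp the rendering at t+1 back to frame t along the flow and
    penalize disagreement inside the dynamic-object mask."""
    warped = flow_warp(render_t1, flow_t_to_t1)
    return (dyn_mask * (warped - render_t).abs()).mean()
```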
arXiv Detail & Related papers (2025-05-14T14:02:49Z)
- Hybrid Rendering for Multimodal Autonomous Driving: Merging Neural and Physics-Based Simulation [1.0027737736304287]
We introduce a hybrid approach that combines the strengths of neural reconstruction with physics-based rendering.
Our approach significantly enhances novel view synthesis quality, especially for road surfaces and lane markings.
We achieve this by training a customized NeRF model on the original images with depth regularization derived from a noisy LiDAR point cloud.
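As a sketch of the depth-regularization idea, here is a per-batch photometric term plus a robust depth term applied only where (possibly noisy) LiDAR returns exist; the function and weighting below are illustrative, not from the paper:

```python
import torch

def depth_regularized_loss(pred_rgb, gt_rgb, pred_depth, lidar_depth,
                           lambda_depth=0.1):
    """Photometric loss plus a robust (Huber) depth term where sparse,
    noisy LiDAR depth is available (lidar_depth <= 0 means no return)."""
    photo = (pred_rgb - gt_rgb).abs().mean()
    valid = lidar_depth > 0
    if valid.any():
        depth = torch.nn.functional.huber_loss(
            pred_depth[valid], lidar_depth[valid], delta=1.0)
    else:
        depth = pred_depth.new_zeros(())
    return photo + lambda_depth * depth
```

A Huber term is one common choice here because it limits the influence of spurious LiDAR returns compared with a plain L2 penalty.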
arXiv Detail & Related papers (2025-03-12T15:18:50Z)
- Pre-Trained Video Generative Models as World Simulators [59.546627730477454]
We propose Dynamic World Simulation (DWS) to transform pre-trained video generative models into controllable world simulators.
To achieve precise alignment between conditioned actions and generated visual changes, we introduce a lightweight, universal action-conditioned module.
Experiments demonstrate that DWS can be flexibly applied to both diffusion and autoregressive transformer models.
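The action-conditioned module is only named in this summary; one common lightweight design it could resemble is FiLM-style conditioning, sketched below under that assumption (class and parameter names are hypothetical):

```python
import torch
import torch.nn as nn

class ActionFiLM(nn.Module):
    """Hypothetical lightweight action-conditioning block: an MLP maps the
    action vector to per-channel scale and shift applied to video features,
    so a frozen generative backbone can be steered by actions."""
    def __init__(self, action_dim, feat_channels):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(action_dim, 128), nn.SiLU(),
            nn.Linear(128, 2 * feat_channels),
        )

    def forward(self, feats, action):
        # feats: (B, C, T, H, W) video features; action: (B, action_dim)
        scale, shift = self.mlp(action).chunk(2, dim=-1)      # (B, C) each
        scale = scale[:, :, None, None, None]
        shift = shift[:, :, None, None, None]
        return feats * (1 + scale) + shift
```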
arXiv Detail & Related papers (2025-02-10T14:49:09Z)
- OmniRe: Omni Urban Scene Reconstruction [78.99262488964423]
We introduce OmniRe, a comprehensive system for creating high-fidelity digital twins of dynamic real-world scenes from on-device logs.
Our approach builds scene graphs on 3DGS and constructs multiple Gaussian representations in canonical spaces that model various dynamic actors.
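A minimal sketch of the scene-graph idea, assuming each node stores Gaussians in a canonical frame plus per-frame rigid poses, and composition simply concatenates background and actor Gaussians (class and function names are illustrative):

```python
import torch

class GaussianNode:
    """Hypothetical scene-graph node: Gaussian means stored in a canonical
    frame plus a per-frame rigid pose (R, t) placing them in the world."""
    def __init__(self, means_canonical, poses):
        self.means = means_canonical      # (N, 3)
        self.poses = poses                # frame_id -> (R: (3,3), t: (3,))

    def world_means(self, frame_id):
        R, t = self.poses[frame_id]
        return self.means @ R.T + t

def compose_scene(background_means, actors, frame_id):
    """Concatenate static background Gaussians with every actor's
    Gaussians transformed into the world for this frame."""
    parts = [background_means] + [a.world_means(frame_id) for a in actors]
    return torch.cat(parts, dim=0)
```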
arXiv Detail & Related papers (2024-08-29T17:56:33Z)
- Motion-Oriented Compositional Neural Radiance Fields for Monocular Dynamic Human Modeling [10.914612535745789]
This paper introduces Motion-oriented Compositional Neural Radiance Fields (MoCo-NeRF), a framework designed to perform free-viewpoint rendering of monocular human videos.
arXiv Detail & Related papers (2024-07-16T17:59:01Z)
- AutoSplat: Constrained Gaussian Splatting for Autonomous Driving Scene Reconstruction [17.600027937450342]
AutoSplat is a framework employing Gaussian splatting to achieve highly realistic reconstructions of autonomous driving scenes.
Our method enables multi-view consistent simulation of challenging scenarios including lane changes.
arXiv Detail & Related papers (2024-07-02T18:36:50Z)
- Dynamic 3D Gaussian Fields for Urban Areas [60.64840836584623]
We present an efficient neural 3D scene representation for novel-view synthesis (NVS) in large-scale, dynamic urban areas.
We propose 4DGF, a neural scene representation that scales to large-scale dynamic urban areas.
arXiv Detail & Related papers (2024-06-05T12:07:39Z)
- DrivingGaussian: Composite Gaussian Splatting for Surrounding Dynamic Autonomous Driving Scenes [57.12439406121721]
We present DrivingGaussian, an efficient and effective framework for surrounding dynamic autonomous driving scenes.
For complex scenes with moving objects, we first sequentially and progressively model the static background of the entire scene.
We then leverage a composite dynamic Gaussian graph to handle multiple moving objects.
We further use a LiDAR prior for Gaussian Splatting to reconstruct scenes with greater details and maintain panoramic consistency.
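How the LiDAR prior enters Gaussian splatting is not detailed in this summary; one plausible reading is LiDAR-seeded initialization, sketched below (the seeding heuristic and all names are assumptions):

```python
import torch

def init_gaussians_from_lidar(points, colors, base_scale=0.05):
    """Hypothetical LiDAR prior: seed one Gaussian per LiDAR return,
    using nearest-neighbor spacing to set an isotropic initial scale."""
    # points: (N, 3) world-space LiDAR returns; colors: (N, 3) in [0, 1].
    # O(N^2) pairwise distances; fine for a sketch, not for full sweeps.
    d = torch.cdist(points, points)
    d.fill_diagonal_(float("inf"))
    nn_dist = d.min(dim=1).values.clamp(min=1e-4)
    return {
        "means": points.clone(),
        "scales": (base_scale * nn_dist).unsqueeze(-1).repeat(1, 3),
        "colors": colors.clone(),
        "opacities": torch.full((points.shape[0], 1), 0.5),
    }
```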
arXiv Detail & Related papers (2023-12-13T06:30:51Z)
- Multi-Object Manipulation via Object-Centric Neural Scattering Functions [40.45919680959231]
We propose using object-centric neural scattering functions (OSFs) as object representations in a model-predictive control framework.
OSFs model per-object light transport, enabling compositional scene re-rendering under object rearrangement and varying lighting conditions.
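A minimal sketch of compositional rendering with per-object scattering functions, assuming each object exposes a callable returning density and color at sample points (the interface is hypothetical):

```python
import torch

def composite_osf_render(ray_samples, object_fns):
    """Hypothetical compositional rendering: each object's function returns
    (density, color) at the sample points; densities add, colors mix in
    proportion to density, then standard volume rendering integrates."""
    # ray_samples: (S, 3) points along one ray, ordered near to far.
    sigmas, colors = [], []
    for fn in object_fns:                    # fn: (S, 3) -> ((S,), (S, 3))
        s, c = fn(ray_samples)
        sigmas.append(s)
        colors.append(c)
    sigma = torch.stack(sigmas).sum(dim=0)                       # (S,)
    weighted = torch.stack(sigmas).unsqueeze(-1) * torch.stack(colors)
    color = weighted.sum(dim=0) / sigma.unsqueeze(-1).clamp(min=1e-8)
    # Volume rendering along the ray (unit sample spacing assumed).
    alpha = 1 - torch.exp(-sigma)
    trans = torch.cumprod(torch.cat([alpha.new_ones(1), 1 - alpha[:-1]]), dim=0)
    return (trans * alpha).unsqueeze(-1).mul(color).sum(dim=0)   # (3,) pixel
```

Because each object contributes its own density and color, objects can be rearranged or relit independently and simply re-composited, which is the property the summary highlights.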
arXiv Detail & Related papers (2023-06-14T21:14:10Z)
- DynIBaR: Neural Dynamic Image-Based Rendering [79.44655794967741]
We address the problem of synthesizing novel views from a monocular video depicting a complex dynamic scene.
We adopt a volumetric image-based rendering framework that synthesizes new viewpoints by aggregating features from nearby views.
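A sketch of the aggregation step, assuming shared intrinsics and per-view extrinsics: sample points are projected into each nearby source view, features are bilinearly sampled, and mean/variance pooling summarizes them (names and the pooling choice are illustrative):

```python
import torch
import torch.nn.functional as F

def aggregate_source_features(points, feats, K, extrinsics):
    """Project 3D points into nearby source views, sample their feature
    maps, and pool across views.
    points: (P, 3); feats: (V, C, H, W); K: (3, 3) shared intrinsics;
    extrinsics: (V, 3, 4) world-to-camera [R|t] per source view."""
    V, C, H, W = feats.shape
    homog = torch.cat([points, points.new_ones(points.shape[0], 1)], dim=-1)
    cam = torch.einsum("vij,pj->vpi", extrinsics, homog)         # (V, P, 3)
    pix = torch.einsum("ij,vpj->vpi", K, cam)
    pix = pix[..., :2] / pix[..., 2:].clamp(min=1e-6)            # (V, P, 2)
    gx = 2 * pix[..., 0] / (W - 1) - 1                           # to [-1, 1]
    gy = 2 * pix[..., 1] / (H - 1) - 1
    grid = torch.stack((gx, gy), dim=-1)                         # (V, P, 2)
    sampled = F.grid_sample(feats, grid.unsqueeze(2),            # (V, C, P, 1)
                            align_corners=True).squeeze(-1)      # (V, C, P)
    pooled = torch.cat([sampled.mean(dim=0), sampled.var(dim=0)], dim=0)
    return pooled.transpose(0, 1)                                # (P, 2C)
```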
We demonstrate significant improvements over state-of-the-art methods on dynamic scene datasets.
arXiv Detail & Related papers (2022-11-20T20:57:02Z)
- MoCo-Flow: Neural Motion Consensus Flow for Dynamic Humans in Stationary Monocular Cameras [98.40768911788854]
We introduce MoCo-Flow, a representation that models the dynamic scene using a 4D continuous time-variant function.
At the heart of our work lies a novel optimization formulation, which is constrained by a motion consensus regularization on the motion flow.
We extensively evaluate MoCo-Flow on several datasets that contain human motions of varying complexity.
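The exact motion consensus regularization is not specified in this summary; one stand-in under that caveat is radiance constancy along predicted scene flow, sketched below with an illustrative 4D field (all names are hypothetical):

```python
import torch
import torch.nn as nn

class DynamicField(nn.Module):
    """Hypothetical 4D time-variant field: an MLP maps (x, y, z, t) to
    (density, RGB), with a second head predicting scene flow to t + dt."""
    def __init__(self, hidden=256):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(4, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.radiance = nn.Linear(hidden, 4)   # density + RGB
        self.flow = nn.Linear(hidden, 3)       # displacement to t + dt

    def forward(self, xyzt):
        h = self.trunk(xyzt)
        return self.radiance(h), self.flow(h)

def motion_consensus_loss(field, xyz, t, dt=1.0 / 30):
    """Stand-in consensus term: radiance sampled at a point should agree
    with radiance at its flow-advected position one time step later."""
    rad_now, fwd = field(torch.cat([xyz, t], dim=-1))
    rad_next, _ = field(torch.cat([xyz + fwd, t + dt], dim=-1))
    return (rad_now - rad_next).abs().mean()

# Usage on a random batch of 4D samples.
field = DynamicField()
xyz, t = torch.rand(1024, 3), torch.rand(1024, 1)
loss = motion_consensus_loss(field, xyz, t)
```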
arXiv Detail & Related papers (2021-06-08T16:03:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.