High-Fidelity Novel View Synthesis via Splatting-Guided Diffusion
- URL: http://arxiv.org/abs/2502.12752v1
- Date: Tue, 18 Feb 2025 11:13:06 GMT
- Title: High-Fidelity Novel View Synthesis via Splatting-Guided Diffusion
- Authors: Xiang Zhang, Yang Zhang, Lukas Mehl, Markus Gross, Christopher Schroers
- Abstract summary: We introduce SplatDiff, a pixel-splatting-guided video diffusion model designed to synthesize high-fidelity novel views from a single image.
To mitigate texture hallucination, we design a texture bridge module that enables high-fidelity texture generation through adaptive feature fusion.
In experiments, SplatDiff shows remarkable zero-shot performance across diverse tasks, including sparse-view NVS and stereo video conversion.
- Score: 15.244909728255417
- License:
- Abstract: Despite recent advances in Novel View Synthesis (NVS), generating high-fidelity views from single or sparse observations remains a significant challenge. Existing splatting-based approaches often produce distorted geometry due to splatting errors. While diffusion-based methods leverage rich 3D priors to achieve improved geometry, they often suffer from texture hallucination. In this paper, we introduce SplatDiff, a pixel-splatting-guided video diffusion model designed to synthesize high-fidelity novel views from a single image. Specifically, we propose an aligned synthesis strategy for precise control of target viewpoints and geometry-consistent view synthesis. To mitigate texture hallucination, we design a texture bridge module that enables high-fidelity texture generation through adaptive feature fusion. In this manner, SplatDiff leverages the strengths of splatting and diffusion to generate novel views with consistent geometry and high-fidelity details. Extensive experiments verify the state-of-the-art performance of SplatDiff in single-view NVS. Additionally, without extra training, SplatDiff shows remarkable zero-shot performance across diverse tasks, including sparse-view NVS and stereo video conversion.
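The abstract describes two ingredients: a pixel-splatting step that pins down the target viewpoint and geometry, and a texture bridge that fuses splatted detail with diffusion features. The sketch below is a minimal illustration of those two ideas, not the authors' implementation; the splatting routine, the gated fusion module, and all names, shapes, and parameters are assumptions made for this example.

```python
# Hypothetical sketch of splatting-guided view synthesis: forward-splat source
# pixels to the target view with estimated depth and a relative pose, then fuse
# splatted (high-fidelity) content with diffusion features via a learned gate.
import torch
import torch.nn as nn


def splat_to_target(src_rgb, src_depth, K, T_rel):
    """Forward-splat source pixels into the target view (nearest-pixel scatter,
    no z-buffering for brevity).

    src_rgb:   (3, H, W) source image
    src_depth: (H, W)    per-pixel depth in the source view
    K:         (3, 3)    intrinsics (assumed shared by both views)
    T_rel:     (4, 4)    source-to-target rigid transform
    """
    _, H, W = src_rgb.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).reshape(3, -1).float()

    # Back-project to 3D in the source camera, then move to the target camera.
    pts_src = torch.linalg.inv(K) @ pix * src_depth.reshape(1, -1)
    pts_h = torch.cat([pts_src, torch.ones(1, pts_src.shape[1])], dim=0)
    pts_tgt = (T_rel @ pts_h)[:3]

    # Project into the target image plane.
    proj = K @ pts_tgt
    z = proj[2].clamp(min=1e-6)
    u = (proj[0] / z).round().long()
    v = (proj[1] / z).round().long()
    valid = (u >= 0) & (u < W) & (v >= 0) & (v < H) & (pts_tgt[2] > 0)

    # Scatter colors into the target view; untouched pixels stay zero (holes).
    out = torch.zeros_like(src_rgb).reshape(3, -1)
    out[:, v[valid] * W + u[valid]] = src_rgb.reshape(3, -1)[:, valid]
    return out.reshape(3, H, W)


class GatedTextureFusion(nn.Module):
    """Hypothetical stand-in for a texture-bridge-style module: a per-pixel gate
    decides how much splatted texture to keep versus diffusion-generated content."""

    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, splat_feat, diff_feat):
        g = self.gate(torch.cat([splat_feat, diff_feat], dim=1))
        return g * splat_feat + (1 - g) * diff_feat


# Toy usage: with an identity pose the splatted image reproduces the input.
rgb = torch.rand(3, 64, 64)
depth = torch.ones(64, 64)
K = torch.tensor([[64.0, 0, 32], [0, 64.0, 32], [0, 0, 1]])
warped = splat_to_target(rgb, depth, K, torch.eye(4))
```

In the actual pipeline the splatted frames would condition a video diffusion model and the fusion would act on decoder features rather than raw RGB; this sketch only conveys the data flow.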
Related papers
- Epipolar-Free 3D Gaussian Splatting for Generalizable Novel View Synthesis [25.924727931514735]
Generalizable 3DGS can reconstruct new scenes from sparse-view observations in a feed-forward inference manner.
Existing methods rely heavily on epipolar priors, which can be unreliable in complex real-world scenes.
We propose eFreeSplat, an efficient feed-forward 3DGS-based model for generalizable novel view synthesis.
arXiv Detail & Related papers (2024-10-30T08:51:29Z)
- ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis [63.169364481672915]
We propose ViewCrafter, a novel method for synthesizing high-fidelity novel views of generic scenes from single or sparse images.
Our method takes advantage of the powerful generation capabilities of video diffusion models and the coarse 3D clues offered by point-based representations to generate high-quality video frames.
arXiv Detail & Related papers (2024-09-03T16:53:19Z)
- MultiDiff: Consistent Novel View Synthesis from a Single Image [60.04215655745264]
MultiDiff is a novel approach for consistent novel view synthesis of scenes from a single RGB image.
Our results demonstrate that MultiDiff outperforms state-of-the-art methods on the challenging, real-world datasets RealEstate10K and ScanNet.
arXiv Detail & Related papers (2024-06-26T17:53:51Z)
- FreeSplat: Generalizable 3D Gaussian Splatting Towards Free-View Synthesis of Indoor Scenes [50.534213038479926]
FreeSplat reconstructs geometrically consistent 3D scenes from long-sequence inputs for free-view synthesis.
We propose a simple but effective free-view training strategy that ensures robust view synthesis across a broader view range, regardless of the number of input views.
arXiv Detail & Related papers (2024-05-28T08:40:14Z)
- Consistent123: Improve Consistency for One Image to 3D Object Synthesis [74.1094516222327]
Large image diffusion models enable novel view synthesis with high quality and excellent zero-shot capability.
However, these models offer no guarantee of view consistency, which limits their performance on downstream tasks like 3D reconstruction and image-to-3D generation.
We propose Consistent123 to synthesize novel views simultaneously by incorporating additional cross-view attention layers and a shared self-attention mechanism (see the sketch after this list).
arXiv Detail & Related papers (2023-10-12T07:38:28Z)
- Generative Novel View Synthesis with 3D-Aware Diffusion Models [96.78397108732233]
We present a diffusion-based model for 3D-aware generative novel view synthesis from as few as a single input image.
Our method makes use of existing 2D diffusion backbones but, crucially, incorporates geometry priors in the form of a 3D feature volume.
In addition to generating novel views, our method has the ability to autoregressively synthesize 3D-consistent sequences.
arXiv Detail & Related papers (2023-04-05T17:15:47Z)
- ProbNVS: Fast Novel View Synthesis with Learned Probability-Guided Sampling [42.37704606186928]
We propose to build a novel view synthesis framework based on learned MVS priors.
We show that our method achieves 15 to 40 times faster rendering compared to state-of-the-art baselines.
arXiv Detail & Related papers (2022-04-07T14:45:42Z)
- Stable View Synthesis [100.86844680362196]
We present Stable View Synthesis (SVS).
Given a set of source images depicting a scene from freely distributed viewpoints, SVS synthesizes new views of the scene.
SVS outperforms state-of-the-art view synthesis methods both quantitatively and qualitatively on three diverse real-world datasets.
arXiv Detail & Related papers (2020-11-14T07:24:43Z)
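As referenced in the Consistent123 entry above, cross-view attention lets every generated view attend to the tokens of all other views so that the outputs stay mutually consistent. The following is a minimal sketch of that idea under a simple joint-attention formulation; module names, tensor shapes, and the residual layout are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical cross-view attention block: per-view feature tokens are flattened
# into one sequence so attention spans all views jointly.
import torch
import torch.nn as nn


class CrossViewAttention(nn.Module):
    def __init__(self, dim, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):
        """x: (batch, views, tokens, dim) feature tokens, e.g. from a shared denoising U-Net."""
        b, v, n, d = x.shape
        seq = x.reshape(b, v * n, d)          # merge the view axis into the sequence
        normed = self.norm(seq)
        attended, _ = self.attn(normed, normed, normed)
        return (seq + attended).reshape(b, v, n, d)  # residual connection


# Usage: fuse features of 4 candidate views with 256 tokens each.
feats = torch.randn(2, 4, 256, 64)
out = CrossViewAttention(dim=64)(feats)
print(out.shape)  # torch.Size([2, 4, 256, 64])
```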