Related papers: PreF3R: Pose-Free Feed-Forward 3D Gaussian Splatting from Variable-length Image Sequence

URL: http://arxiv.org/abs/2411.16877v1
Date: Mon, 25 Nov 2024 19:16:29 GMT
Title: PreF3R: Pose-Free Feed-Forward 3D Gaussian Splatting from Variable-length Image Sequence
Authors: Zequn Chen, Jiezhi Yang, Heng Yang
Abstract summary: We present PreF3R, Pose-Free Feed-forward 3D Reconstruction from an image sequence of variable length. PreF3R removes the need for camera calibration and reconstructs the 3D Gaussian field within a canonical coordinate frame directly from a sequence of unposed images.
Score: 3.61512056914095
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present PreF3R, Pose-Free Feed-forward 3D Reconstruction from an image sequence of variable length. Unlike previous approaches, PreF3R removes the need for camera calibration and reconstructs the 3D Gaussian field within a canonical coordinate frame directly from a sequence of unposed images, enabling efficient novel-view rendering. We leverage DUSt3R's ability for pair-wise 3D structure reconstruction, and extend it to sequential multi-view input via a spatial memory network, eliminating the need for optimization-based global alignment. Additionally, PreF3R incorporates a dense Gaussian parameter prediction head, which enables subsequent novel-view synthesis with differentiable rasterization. This allows supervising our model with the combination of photometric loss and pointmap regression loss, enhancing both photorealism and structural accuracy. Given a sequence of ordered images, PreF3R incrementally reconstructs the 3D Gaussian field at 20 FPS, therefore enabling real-time novel-view rendering. Empirical experiments demonstrate that PreF3R is an effective solution for the challenging task of pose-free feed-forward novel-view synthesis, while also exhibiting robust generalization to unseen scenes.

Related papers

Off The Grid: Detection of Primitives for Feed-Forward 3D Gaussian Splatting [33.7339252839354]
We introduce a new feed-forward architecture that detects 3D Gaussian primitives at a sub-pixel level. Inspired by keypoint detection, our decoder learns to distribute primitives across image patches. Our resulting pose-free model generates scenes in seconds, achieving state-of-the-art novel view synthesis for feed-forward models.
arXiv Detail & Related papers (2025-12-17T14:59:21Z)
FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction [59.77970844874235]
We present FreeSplatter, a feed-forward reconstruction framework capable of generating high-quality 3D Gaussians from sparse-view images. FreeSplatter is built upon a streamlined transformer architecture, comprising sequential self-attention blocks. We show FreeSplatter's potential in enhancing the productivity of downstream applications, such as text/image-to-3D content creation.
arXiv Detail & Related papers (2024-12-12T18:52:53Z)
USP-Gaussian: Unifying Spike-based Image Reconstruction, Pose Correction and Gaussian Splatting [45.246178004823534]
Spike cameras, innovative neuromorphic cameras that capture scenes as a 0-1 bit stream at 40 kHz, are increasingly employed for 3D reconstruction. Previous spike-based 3D reconstruction approaches often employ a cascaded pipeline. We propose a synergistic optimization framework, USP-Gaussian, that unifies spike-based image reconstruction, pose correction, and Gaussian splatting into an end-to-end framework.
arXiv Detail & Related papers (2024-11-15T14:15:16Z)
No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images [100.80376573969045]
NoPoSplat is a feed-forward model capable of reconstructing 3D scenes parameterized by 3D Gaussians from multi-view images. Our model achieves real-time 3D Gaussian reconstruction during inference. This work makes significant advances in pose-free generalizable 3D reconstruction and demonstrates its applicability to real-world scenarios.
arXiv Detail & Related papers (2024-10-31T17:58:22Z)
PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting [54.7468067660037]
PF3plat sets a new state-of-the-art across all benchmarks, supported by comprehensive ablation studies validating our design choices. Our framework capitalizes on fast speed, scalability, and high-quality 3D reconstruction and view synthesis capabilities of 3DGS.
arXiv Detail & Related papers (2024-10-29T15:28:15Z)
Gaussian Opacity Fields: Efficient Adaptive Surface Reconstruction in Unbounded Scenes [50.92217884840301]
Gaussian Opacity Fields (GOF) is a novel approach for efficient, high-quality, and adaptive surface reconstruction in scenes. GOF is derived from ray-tracing-based volume rendering of 3D Gaussians. GOF surpasses existing 3DGS-based methods in surface reconstruction and novel view synthesis.
arXiv Detail & Related papers (2024-04-16T17:57:19Z)
InstantSplat: Sparse-view Gaussian Splatting in Seconds [91.77050739918037]
We introduce InstantSplat, a novel approach for addressing sparse-view 3D scene reconstruction at lightning-fast speed. InstantSplat employs a self-supervised framework that optimizes 3D scene representation and camera poses. It achieves an acceleration of over 30x in reconstruction and improves visual quality (SSIM) from 0.3755 to 0.7624 compared to traditional SfM with 3D-GS.
arXiv Detail & Related papers (2024-03-29T17:29:58Z)
CoherentGS: Sparse Novel View Synthesis with Coherent 3D Gaussians [18.42203035154126]
We introduce a structured Gaussian representation that can be controlled in 2D image space. We then constrain the Gaussians, in particular their position, and prevent them from moving independently during optimization. We demonstrate significant improvements compared to the state-of-the-art sparse-view NeRF-based approaches on a variety of scenes.
arXiv Detail & Related papers (2024-03-28T15:27:13Z)
GGRt: Towards Pose-free Generalizable 3D Gaussian Splatting in Real-time [112.32349668385635]
GGRt is a novel approach to generalizable novel view synthesis that alleviates the need for real camera poses. As the first pose-free generalizable 3D-GS framework, GGRt achieves inference at $\ge$ 5 FPS and real-time rendering at $\ge$ 100 FPS.
arXiv Detail & Related papers (2024-03-15T09:47:35Z)
pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction [26.72289913260324]
pixelSplat is a feed-forward model that learns to reconstruct 3D radiance fields parameterized by 3D Gaussian primitives from pairs of images. Our model features real-time and memory-efficient rendering for scalable training as well as fast 3D reconstruction at inference time.
arXiv Detail & Related papers (2023-12-19T17:03:50Z)
GPS-Gaussian: Generalizable Pixel-wise 3D Gaussian Splatting for Real-time Human Novel View Synthesis [70.24111297192057]
We present a new approach, termed GPS-Gaussian, for synthesizing novel views of a character in a real-time manner. The proposed method enables 2K-resolution rendering under a sparse-view camera setting.
arXiv Detail & Related papers (2023-12-04T18:59:55Z)
High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views. Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z)

This list is automatically generated from the titles and abstracts of the papers on this site.