Pose-Free Generalizable Rendering Transformer
- URL: http://arxiv.org/abs/2310.03704v3
- Date: Wed, 27 Dec 2023 22:42:04 GMT
- Title: Pose-Free Generalizable Rendering Transformer
- Authors: Zhiwen Fan, Panwang Pan, Peihao Wang, Yifan Jiang, Hanwen Jiang, Dejia
Xu, Zehao Zhu, Dilin Wang, Zhangyang Wang
- Abstract summary: PF-GRT is a Pose-Free framework for Generalizable Rendering Transformer.
PF-GRT is parameterized using a local relative coordinate system.
Zero-shot rendering experiments across datasets show that it produces photo-realistic images of superior quality.
- Score: 72.47072706742065
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the field of novel-view synthesis, the necessity of knowing camera poses
(e.g., via Structure from Motion) before rendering has been a common practice.
However, the consistent acquisition of accurate camera poses remains elusive,
and errors in pose extraction can adversely impact the view synthesis process.
To address this challenge, we introduce PF-GRT, a new Pose-Free framework for
Generalizable Rendering Transformer, eliminating the need for pre-computed
camera poses and instead leveraging feature-matching learned directly from
data. PF-GRT is parameterized using a local relative coordinate system, where
one of the source images is set as the origin. An OmniView Transformer is
designed for fusing multi-view cues under the pose-free setting, where
unposed-view fusion and origin-centric aggregation are performed. 3D point
features along each target ray are sampled by projecting them onto the selected
origin plane. The final pixel intensities are modulated and decoded using
another Transformer. PF-GRT demonstrates an impressive ability to generalize to
new scenes that were not encountered during training, without the need to
pre-compute camera poses. Our zero-shot rendering experiments on the LLFF,
RealEstate-10k, Shiny, and Blender datasets show that it produces
photo-realistic images of superior quality. Moreover, it
demonstrates robustness against noise in test camera poses. Code is available
at https://zhiwenfan.github.io/PF-GRT/.
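As a rough illustration of the pipeline described in the abstract (unposed-view fusion, origin-centric aggregation, and sampling of target-ray features on the origin plane), the sketch below shows one way such a stack could be wired up. It is a minimal sketch, not the released implementation: the module names, tensor shapes, the use of standard multi-head attention, and the pinhole projection used for plane sampling are assumptions made for illustration; the actual OmniView Transformer and ray decoder are in the repository linked above.

```python
# Minimal conceptual sketch of the PF-GRT pipeline described above; NOT the
# authors' implementation. Module names, dimensions, and the two-stage fusion
# (unposed-view fusion, then origin-centric aggregation) are assumptions.
import torch
import torch.nn as nn


class OmniViewFusion(nn.Module):
    """Fuses unposed source-view features into the origin view via attention."""

    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        # Stage 1: attention across unposed source views (no camera poses used).
        self.unposed_fusion = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Stage 2: aggregation of the fused cues into the origin-view tokens.
        self.origin_aggregation = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, origin_tokens, source_tokens):
        # origin_tokens: (B, N_origin, dim) tokens from the view chosen as origin
        # source_tokens: (B, N_source, dim) tokens from the remaining unposed views
        fused, _ = self.unposed_fusion(source_tokens, source_tokens, source_tokens)
        aggregated, _ = self.origin_aggregation(origin_tokens, fused, fused)
        return aggregated


def sample_ray_features_on_origin_plane(feature_map, ray_points, intrinsics):
    """Project 3D points along a target ray onto the origin image plane and
    bilinearly sample features there (relative coordinates, origin camera at identity)."""
    # feature_map: (B, C, H, W) origin-view feature map
    # ray_points:  (B, N, 3)    points sampled along the target ray, in the origin frame
    # intrinsics:  (B, 3, 3)    pinhole intrinsics of the origin view (assumed)
    B, C, H, W = feature_map.shape
    projected = torch.einsum("bij,bnj->bni", intrinsics, ray_points)   # (B, N, 3)
    uv = projected[..., :2] / projected[..., 2:3].clamp(min=1e-6)      # pixel coordinates
    # Normalize to [-1, 1] for grid_sample.
    grid = torch.stack(
        [uv[..., 0] / (W - 1) * 2 - 1, uv[..., 1] / (H - 1) * 2 - 1], dim=-1
    ).unsqueeze(2)                                                     # (B, N, 1, 2)
    sampled = nn.functional.grid_sample(feature_map, grid, align_corners=True)
    return sampled.squeeze(-1).permute(0, 2, 1)                        # (B, N, C)
```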
Related papers
- No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images [100.80376573969045]
NoPoSplat is a feed-forward model capable of reconstructing 3D scenes parameterized by 3D Gaussians from multi-view images.
Our model achieves real-time 3D Gaussian reconstruction during inference.
This work makes significant advances in pose-free generalizable 3D reconstruction and demonstrates its applicability to real-world scenarios.
arXiv Detail & Related papers (2024-10-31T17:58:22Z)
- EF-3DGS: Event-Aided Free-Trajectory 3D Gaussian Splatting [76.02450110026747]
Event cameras, inspired by biological vision, record pixel-wise intensity changes asynchronously with high temporal resolution.
We propose Event-Aided Free-Trajectory 3DGS, which seamlessly integrates the advantages of event cameras into 3DGS.
We evaluate our method on the public Tanks and Temples benchmark and a newly collected real-world dataset, RealEv-DAVIS.
arXiv Detail & Related papers (2024-10-20T13:44:24Z)
- COLMAP-Free 3D Gaussian Splatting [88.420322646756]
We propose a novel method to perform novel view synthesis without any SfM preprocessing.
We process the input frames in a sequential manner and progressively grow the set of 3D Gaussians by taking one input frame at a time.
Our method significantly improves over previous approaches in view synthesis and camera pose estimation under large motion changes.
arXiv Detail & Related papers (2023-12-12T18:39:52Z)
- FlowCam: Training Generalizable 3D Radiance Fields without Camera Poses via Pixel-Aligned Scene Flow [26.528667940013598]
Reconstruction of 3D neural fields from posed images has emerged as a promising method for self-supervised representation learning.
A key challenge preventing the deployment of these 3D scene learners on large-scale video data is their dependence on precise camera poses from structure-from-motion.
We propose a method that jointly reconstructs camera poses and 3D neural scene representations online and in a single forward pass.
arXiv Detail & Related papers (2023-05-31T20:58:46Z)
- RePAST: Relative Pose Attention Scene Representation Transformer [78.33038881681018]
Scene Representation Transformer (SRT) is a recent method to render novel views at interactive rates.
We propose Relative Pose Attention SRT (RePAST): instead of fixing a reference frame at the input, we inject pairwise relative camera pose information directly into the attention mechanism of the Transformers (a rough sketch of this idea appears after this list).
arXiv Detail & Related papers (2023-04-03T13:13:12Z)
- Shape, Pose, and Appearance from a Single Image via Bootstrapped Radiance Field Inversion [54.151979979158085]
We introduce a principled end-to-end reconstruction framework for natural images, where accurate ground-truth poses are not available.
We leverage an unconditional 3D-aware generator and apply a hybrid inversion scheme in which a model produces a first guess of the solution.
Our framework can de-render an image in as few as 10 steps, enabling its use in practical scenarios.
arXiv Detail & Related papers (2022-11-21T17:42:42Z)
- Structure-Aware NeRF without Posed Camera via Epipolar Constraint [8.115535686311249]
The neural radiance field (NeRF) for realistic novel view synthesis requires camera poses to be pre-acquired.
We integrate the pose extraction and view synthesis into a single end-to-end procedure so they can benefit from each other.
arXiv Detail & Related papers (2022-10-01T03:57:39Z)
- FreeStyleGAN: Free-view Editable Portrait Rendering with the Camera Manifold [5.462226912969161]
Current Generative Adversarial Networks (GANs) produce photorealistic renderings of portrait images.
We show how our approach enables the integration of a pre-trained StyleGAN into standard 3D rendering pipelines.
Our solution enables the first truly free-viewpoint rendering of realistic faces at interactive rates.
arXiv Detail & Related papers (2021-09-20T08:59:21Z)
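Following up on the RePAST entry above: injecting pairwise relative camera poses directly into attention can be illustrated with the toy sketch below. It is not the paper's architecture; encoding the 4x4 relative pose with a small MLP and adding it to the source-view tokens before attention is an assumption made purely for illustration.

```python
# Rough illustration of relative-pose-conditioned attention in the spirit of
# the RePAST entry above; NOT the paper's architecture. Encoding the 4x4
# relative pose with a small MLP and biasing the key/value tokens is assumed.
import torch
import torch.nn as nn


class RelativePoseAttention(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.pose_encoder = nn.Sequential(
            nn.Linear(16, dim), nn.ReLU(), nn.Linear(dim, dim)
        )
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, query_tokens, context_tokens, relative_pose):
        # query_tokens:   (B, Nq, dim) tokens of the view being rendered
        # context_tokens: (B, Nc, dim) tokens of a source view
        # relative_pose:  (B, 4, 4)    pose of the source view relative to the query view
        pose_feat = self.pose_encoder(relative_pose.flatten(1))   # (B, dim)
        context = context_tokens + pose_feat.unsqueeze(1)         # condition keys/values
        out, _ = self.attn(query_tokens, context, context)
        return out
```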