Explicit Correspondence Matching for Generalizable Neural Radiance
Fields
- URL: http://arxiv.org/abs/2304.12294v1
- Date: Mon, 24 Apr 2023 17:46:01 GMT
- Title: Explicit Correspondence Matching for Generalizable Neural Radiance
Fields
- Authors: Yuedong Chen, Haofei Xu, Qianyi Wu, Chuanxia Zheng, Tat-Jen Cham,
Jianfei Cai
- Abstract summary: We present a new NeRF method that is able to generalize to new unseen scenarios and perform novel view synthesis with as few as two source views.
The explicit correspondence matching is quantified with the cosine similarity between image features sampled at the 2D projections of a 3D point on different views.
Our method achieves state-of-the-art results on different evaluation settings, with the experiments showing a strong correlation between our learned cosine feature similarity and volume density.
- Score: 49.49773108695526
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present a new generalizable NeRF method that is able to directly
generalize to new unseen scenarios and perform novel view synthesis with as few
as two source views. The key to our approach lies in explicitly modeled
correspondence matching information, which provides a geometry prior for
predicting NeRF color and density in volume rendering. The explicit
correspondence matching is quantified with the cosine similarity between image
features sampled at the 2D projections of a 3D point on different views, which
is able to provide reliable cues about the surface geometry. Unlike previous
methods where image features are extracted independently for each view, we
consider modeling the cross-view interactions via Transformer cross-attention,
which greatly improves the feature matching quality. Our method achieves
state-of-the-art results on different evaluation settings, with the experiments
showing a strong correlation between our learned cosine feature similarity and
volume density, demonstrating the effectiveness and superiority of our proposed
method. Code is at https://github.com/donydchen/matchnerf
Related papers
- Robust Scene Change Detection Using Visual Foundation Models and Cross-Attention Mechanisms [27.882122236282054]
We present a novel method for scene change detection that leverages the robust feature extraction capabilities of a visual foundation model, DINOv2.
We evaluate our approach on two benchmark datasets, VL-CMU-CD and PSCD, along with their viewpoint-varied versions.
Our experiments demonstrate significant improvements in F1-score, particularly in scenarios involving geometric changes between image pairs.
arXiv Detail & Related papers (2024-09-25T11:55:27Z)
- HawkI: Homography & Mutual Information Guidance for 3D-free Single Image to Aerial View [67.8213192993001]
We present HawkI, for synthesizing aerial-view images from text and an exemplar image.
HawkI blends the visual features from the input image within a pretrained text-to-2D-image Stable Diffusion model.
At inference, HawkI employs a unique mutual information guidance formulation to steer the generated image towards faithfully replicating the semantic details of the input image.
arXiv Detail & Related papers (2023-11-27T01:41:25Z)
- LFM-3D: Learnable Feature Matching Across Wide Baselines Using 3D Signals [9.201550006194994]
Learnable matchers often underperform when there exist only small regions of co-visibility between image pairs.
We propose LFM-3D, a Learnable Feature Matching framework that uses models based on graph neural networks.
We show that the resulting improved correspondences lead to much higher relative posing accuracy for in-the-wild image pairs.
arXiv Detail & Related papers (2023-03-22T17:46:27Z)
- Geometry-biased Transformers for Novel View Synthesis [36.11342728319563]
We tackle the task of synthesizing novel views of an object given a few input images and associated camera viewpoints.
Our work is inspired by recent 'geometry-free' approaches where multi-view images are encoded as a (global) set-latent representation.
We propose 'Geometry-biased Transformers' (GBTs) that incorporate geometric inductive biases in the set-latent representation-based inference.
arXiv Detail & Related papers (2023-01-11T18:59:56Z)
- Unifying Flow, Stereo and Depth Estimation [121.54066319299261]
We present a unified formulation and model for three motion and 3D perception tasks.
We formulate all three tasks as a unified dense correspondence matching problem.
Our model naturally enables cross-task transfer since the model architecture and parameters are shared across tasks.
arXiv Detail & Related papers (2022-11-10T18:59:54Z)
- Coordinates Are NOT Lonely -- Codebook Prior Helps Implicit Neural 3D Representations [29.756718435405983]
Implicit neural 3D representation has achieved impressive results in surface or scene reconstruction and novel view synthesis.
Existing approaches, such as Neural Radiance Field (NeRF) and its variants, usually require dense input views.
We introduce a novel coordinate-based model, CoCo-INR, for implicit neural 3D representation.
arXiv Detail & Related papers (2022-10-20T11:13:50Z)
- Out of Sight, Out of Mind: A Source-View-Wise Feature Aggregation for Multi-View Image-Based Rendering [26.866141260616793]
We propose a source-view-wise feature aggregation method, which enables us to robustly find the consensus among source views.
We validate the proposed method on various benchmark datasets, including synthetic and real image scenes.
arXiv Detail & Related papers (2022-06-10T07:06:05Z)
- A Model for Multi-View Residual Covariances based on Perspective Deformation [88.21738020902411]
We derive a model for the covariance of the visual residuals in multi-view SfM, odometry and SLAM setups.
We validate our model with synthetic and real data and integrate it into photometric and feature-based Bundle Adjustment.
arXiv Detail & Related papers (2022-02-01T21:21:56Z)
- Extracting Triangular 3D Models, Materials, and Lighting From Images [59.33666140713829]
We present an efficient method for joint optimization of materials and lighting from multi-view image observations.
We leverage meshes with spatially-varying materials and environment lighting that can be deployed in any traditional graphics engine.
arXiv Detail & Related papers (2021-11-24T13:58:20Z)
- GOCor: Bringing Globally Optimized Correspondence Volumes into Your Neural Network [176.3781969089004]
The feature correlation layer is a key neural network module in computer vision problems that involve dense correspondences between image pairs.
We propose GOCor, a fully differentiable dense matching module that acts as a direct replacement for the feature correlation layer.
Our approach significantly outperforms the feature correlation layer for the tasks of geometric matching, optical flow, and dense semantic matching.
arXiv Detail & Related papers (2020-09-16T17:33:01Z)
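As context for the GOCor entry: the plain feature correlation layer it replaces can be written in a few lines, correlating every location of a reference feature map with every location of a query feature map to form a dense correlation volume. This is a generic sketch of that baseline layer under assumed shapes, not GOCor itself, which further refines the volume through internal optimization.

```python
import torch

def global_correlation(f_ref, f_query):
    """Plain global feature correlation: the dot product between features at
    every reference location and every query location. Inputs are (C, H, W)
    feature maps; the output is an (H*W, H, W) correlation volume."""
    c, h, w = f_ref.shape
    ref = f_ref.reshape(c, -1)                      # (C, H*W)
    query = f_query.reshape(c, -1)                  # (C, H*W)
    corr = ref.T @ query                            # (H*W, H*W)
    return corr.reshape(h * w, h, w) / c ** 0.5     # scaled for stability

# Example: two random 128-channel feature maps at 32x32 resolution.
corr = global_correlation(torch.randn(128, 32, 32), torch.randn(128, 32, 32))
assert corr.shape == (32 * 32, 32, 32)
```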
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.