LFM-3D: Learnable Feature Matching Across Wide Baselines Using 3D
Signals
- URL: http://arxiv.org/abs/2303.12779v3
- Date: Tue, 30 Jan 2024 18:07:12 GMT
- Title: LFM-3D: Learnable Feature Matching Across Wide Baselines Using 3D
Signals
- Authors: Arjun Karpur, Guilherme Perrotta, Ricardo Martin-Brualla, Howard Zhou,
André Araujo
- Abstract summary: Learnable matchers often underperform when there exist only small regions of co-visibility between image pairs.
We propose LFM-3D, a Learnable Feature Matching framework that uses models based on graph neural networks.
We show that the resulting improved correspondences lead to much higher relative posing accuracy for in-the-wild image pairs.
- Score: 9.201550006194994
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Finding localized correspondences across different images of the same object
is crucial to understand its geometry. In recent years, this problem has seen
remarkable progress with the advent of deep learning-based local image features
and learnable matchers. Still, learnable matchers often underperform when there
exist only small regions of co-visibility between image pairs (i.e., wide
camera baselines). To address this problem, we leverage recent progress in
coarse single-view geometry estimation methods. We propose LFM-3D, a Learnable
Feature Matching framework that uses models based on graph neural networks and
enhances their capabilities by integrating noisy, estimated 3D signals to boost
correspondence estimation. When integrating 3D signals into the matcher model,
we show that a suitable positional encoding is critical to effectively make use
of the low-dimensional 3D information. We experiment with two different 3D
signals - normalized object coordinates and monocular depth estimates - and
evaluate our method on large-scale (synthetic and real) datasets containing
object-centric image pairs across wide baselines. We observe strong feature
matching improvements compared to 2D-only methods, with up to +6% total recall
and +28% precision at fixed recall. Additionally, we demonstrate that the
resulting improved correspondences lead to much higher relative posing accuracy
for in-the-wild image pairs - up to 8.6% compared to the 2D-only approach.
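
The abstract notes that a suitable positional encoding is critical for using the low-dimensional 3D signal, but it does not say which encoding is used. As a rough illustration only, the sketch below assumes a sinusoidal (Fourier-feature) encoding; the function name `sinusoidal_encoding`, the frequency schedule, and `num_frequencies` are hypothetical choices and are not taken from the paper.

```python
import math


def sinusoidal_encoding(point, num_frequencies=8):
    """Lift a low-dimensional point (e.g. an estimated 3D coordinate)
    to a higher-dimensional feature vector.

    Assumption: frequencies follow the schedule 2^k * pi for
    k = 0 .. num_frequencies - 1. The paper may use a different scheme.
    """
    freqs = [(2.0 ** k) * math.pi for k in range(num_frequencies)]
    features = []
    for coord in point:
        # For each coordinate: all sine terms first, then all cosine terms.
        features.extend(math.sin(f * coord) for f in freqs)
        features.extend(math.cos(f * coord) for f in freqs)
    return features


# A hypothetical noisy 3D estimate attached to one keypoint.
estimated_point = (0.12, -0.40, 0.73)
encoded = sinusoidal_encoding(estimated_point)
print(len(encoded))  # 3 coordinates * 2 functions * 8 frequencies = 48
```

In a matcher of this kind, such a vector would typically be combined with the keypoint's 2D descriptor before the graph neural network; how LFM-3D does this is not described in the abstract.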
Related papers
- Repeat and Concatenate: 2D to 3D Image Translation with 3D to 3D Generative Modeling [14.341099905684844]
This paper investigates a 2D to 3D image translation method with a straightforward technique, enabling correlated 2D X-ray to 3D CT-like reconstruction.
We observe that existing approaches, which integrate information across multiple 2D views in the latent space, lose valuable signal information during latent encoding. Instead, we simply repeat and concatenate the 2D views into higher-channel 3D volumes and approach the 3D reconstruction challenge as a straightforward 3D to 3D generative modeling problem.
This method enables the reconstructed 3D volume to retain valuable information from the 2D inputs, which are passed between channel states in a Swin U
arXiv Detail & Related papers (2024-06-26T15:18:20Z)
- Self-supervised Learning of LiDAR 3D Point Clouds via 2D-3D Neural Calibration [107.61458720202984]
This paper introduces a novel self-supervised learning framework for enhancing 3D perception in autonomous driving scenes.
We propose the learnable transformation alignment to bridge the domain gap between image and point cloud data.
We establish dense 2D-3D correspondences to estimate the rigid pose.
arXiv Detail & Related papers (2024-01-23T02:41:06Z)
- 3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features [70.50665869806188]
3DiffTection is a state-of-the-art method for 3D object detection from single images.
We fine-tune a diffusion model to perform novel view synthesis conditioned on a single image.
We further train the model on target data with detection supervision.
arXiv Detail & Related papers (2023-11-07T23:46:41Z)
- EP2P-Loc: End-to-End 3D Point to 2D Pixel Localization for Large-Scale
Visual Localization [44.05930316729542]
We propose EP2P-Loc, a novel large-scale visual localization method for 3D point clouds.
To increase the number of inliers, we propose a simple algorithm to remove invisible 3D points in the image.
For the first time in this task, we employ a differentiable PnP for end-to-end training.
arXiv Detail & Related papers (2023-09-14T07:06:36Z)
- CheckerPose: Progressive Dense Keypoint Localization for Object Pose
Estimation with Graph Neural Network [66.24726878647543]
Estimating the 6-DoF pose of a rigid object from a single RGB image is a crucial yet challenging task.
Recent studies have shown the great potential of dense correspondence-based solutions.
We propose a novel pose estimation algorithm named CheckerPose, which improves on three main aspects.
arXiv Detail & Related papers (2023-03-29T17:30:53Z)
- Improving Feature-based Visual Localization by Geometry-Aided Matching [21.1967752160412]
We introduce a novel 2D-3D matching method, Geometry-Aided Matching (GAM), which uses both appearance information and geometric context to improve 2D-3D feature matching.
GAM can greatly strengthen the recall of 2D-3D matches while maintaining high precision.
Our proposed localization method achieves state-of-the-art results on multiple visual localization datasets.
arXiv Detail & Related papers (2022-11-16T07:02:12Z)
- Learning Stereopsis from Geometric Synthesis for 6D Object Pose
Estimation [11.999630902627864]
Current monocular-based 6D object pose estimation methods generally achieve less competitive results than RGBD-based methods.
This paper proposes a 3D geometric volume based pose estimation method with a short baseline two-view setting.
Experiments show that our method outperforms state-of-the-art monocular-based methods, and is robust in different objects and scenes.
arXiv Detail & Related papers (2021-09-25T02:55:05Z)
- Soft Expectation and Deep Maximization for Image Feature Detection [68.8204255655161]
We propose SEDM, an iterative semi-supervised learning process that flips the question and first looks for repeatable 3D points, then trains a detector to localize them in image space.
Our results show that this new model trained using SEDM is able to better localize the underlying 3D points in a scene.
arXiv Detail & Related papers (2021-04-21T00:35:32Z)
- Learning 2D-3D Correspondences To Solve The Blind Perspective-n-Point
Problem [98.92148855291363]
This paper proposes a deep CNN model which simultaneously solves for both the 6-DoF absolute camera pose and 2D-3D correspondences.
Tests on both real and simulated data have shown that our method substantially outperforms existing approaches.
arXiv Detail & Related papers (2020-03-15T04:17:30Z)
- ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object
Detection [69.68263074432224]
We present a novel framework named ZoomNet for stereo imagery-based 3D detection.
The pipeline of ZoomNet begins with an ordinary 2D object detection model which is used to obtain pairs of left-right bounding boxes.
To further exploit the abundant texture cues in RGB images for more accurate disparity estimation, we introduce a conceptually straightforward module: adaptive zooming.
arXiv Detail & Related papers (2020-03-01T17:18:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.