End2End Multi-View Feature Matching with Differentiable Pose
Optimization
- URL: http://arxiv.org/abs/2205.01694v3
- Date: Mon, 11 Sep 2023 10:06:19 GMT
- Title: End2End Multi-View Feature Matching with Differentiable Pose
Optimization
- Authors: Barbara Roessle and Matthias Nießner
- Abstract summary: We propose a graph attention network to predict image correspondences along with confidence weights.
The resulting matches serve as weighted constraints in a differentiable pose estimation.
We integrate information from multiple views by spanning the graph across multiple frames to predict the matches all at once.
- Score: 2.311583680973075
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Erroneous feature matches have severe impact on subsequent camera pose
estimation and often require additional, time-costly measures, like RANSAC, for
outlier rejection. Our method tackles this challenge by addressing feature
matching and pose optimization jointly. To this end, we propose a graph
attention network to predict image correspondences along with confidence
weights. The resulting matches serve as weighted constraints in a
differentiable pose estimation. Training feature matching with gradients from
pose optimization naturally learns to down-weight outliers and boosts pose
estimation on image pairs compared to SuperGlue by 6.7% on ScanNet. At the same
time, it reduces the pose estimation time by over 50% and renders RANSAC
iterations unnecessary. Moreover, we integrate information from multiple views
by spanning the graph across multiple frames to predict the matches all at
once. Multi-view matching combined with end-to-end training improves the pose
estimation metrics on Matterport3D by 18.5% compared to SuperGlue.
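The idea of using matches as weighted constraints can be illustrated with a small sketch. The code below is not taken from the paper: it assumes a weighted eight-point estimate of the essential matrix as a stand-in for the pose solver, and the function names, the synthetic scene, and the hand-set weights are all hypothetical. In the paper's setting the weights would be predicted by the network; here outliers are simply given zero confidence to show how down-weighting removes their influence without RANSAC.

```python
import numpy as np


def weighted_eight_point(x1, x2, w):
    """Estimate an essential matrix from weighted correspondences.

    x1, x2: (N, 2) normalized image coordinates (intrinsics already removed).
    w:      (N,) non-negative confidence weights; a weight of 0 ignores a match.
    """
    x1h = np.hstack([x1, np.ones((len(x1), 1))])
    x2h = np.hstack([x2, np.ones((len(x2), 1))])
    # Each row encodes the epipolar constraint x2^T E x1 = 0, scaled by its weight
    # (so the squared residuals are effectively scaled by w**2).
    A = w[:, None] * np.einsum("ni,nj->nij", x2h, x1h).reshape(-1, 9)
    # Least-squares solution: right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    E = vt[-1].reshape(3, 3)
    # Project onto valid essential matrices (singular values 1, 1, 0).
    u, _, vt = np.linalg.svd(E)
    return u @ np.diag([1.0, 1.0, 0.0]) @ vt


def inlier_residual(E, x1, x2):
    """Largest absolute epipolar residual |x2^T E x1| over the given matches."""
    x1h = np.hstack([x1, np.ones((len(x1), 1))])
    x2h = np.hstack([x2, np.ones((len(x2), 1))])
    return float(np.max(np.abs(np.einsum("ni,ij,nj->n", x2h, E, x1h))))


def make_synthetic_matches(seed=0, n=60, n_outliers=20):
    """Hypothetical test scene: exact matches plus a block of corrupted ones."""
    rng = np.random.default_rng(seed)
    X = rng.uniform([-1, -1, 4], [1, 1, 8], size=(n, 3))  # points in front of both cameras
    a = 0.1  # rotation about the y axis, in radians
    R = np.array([[np.cos(a), 0.0, np.sin(a)],
                  [0.0, 1.0, 0.0],
                  [-np.sin(a), 0.0, np.cos(a)]])
    t = np.array([0.5, 0.1, 0.0])
    X2 = X @ R.T + t
    x1 = X[:, :2] / X[:, 2:]
    x2 = X2[:, :2] / X2[:, 2:]
    # Corrupt the first matches and assign them zero confidence.
    x2[:n_outliers] += rng.normal(scale=0.3, size=(n_outliers, 2))
    w = np.ones(n)
    w[:n_outliers] = 0.0
    return x1, x2, w


if __name__ == "__main__":
    x1, x2, w = make_synthetic_matches()
    inl = w > 0
    E_weighted = weighted_eight_point(x1, x2, w)
    E_uniform = weighted_eight_point(x1, x2, np.ones_like(w))
    print("weighted :", inlier_residual(E_weighted, x1[inl], x2[inl]))
    print("uniform  :", inlier_residual(E_uniform, x1[inl], x2[inl]))
```

Because every step is a differentiable linear-algebra operation, the same construction could in principle be written in an autodiff framework so that a pose loss back-propagates into the weights, which is the training signal the abstract describes.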
Related papers
- One Diffusion to Generate Them All [54.82732533013014]
OneDiffusion is a versatile, large-scale diffusion model that supports bidirectional image synthesis and understanding.
It enables conditional generation from inputs such as text, depth, pose, layout, and semantic maps.
OneDiffusion allows for multi-view generation, camera pose estimation, and instant personalization using sequential image inputs.
arXiv Detail & Related papers (2024-11-25T12:11:05Z)
- PRISM: PRogressive dependency maxImization for Scale-invariant image Matching [4.9521269535586185]
We propose PRogressive dependency maxImization for Scale-invariant image Matching (PRISM).
Our method's superior matching performance and generalization capability are confirmed by leading accuracy across various evaluation benchmarks and downstream tasks.
arXiv Detail & Related papers (2024-08-07T07:35:17Z)
- DVMNet: Computing Relative Pose for Unseen Objects Beyond Hypotheses [59.51874686414509]
Current approaches approximate the continuous pose representation with a large number of discrete pose hypotheses.
We present a Deep Voxel Matching Network (DVMNet) that eliminates the need for pose hypotheses and computes the relative object pose in a single pass.
Our method delivers more accurate relative pose estimates for novel objects at a lower computational cost compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-03-20T15:41:32Z)
- PRAGO: Differentiable Multi-View Pose Optimization From Objectness Detections [19.211193336526346]
We propose a Pose-refined Rotation Averaging Graph Optimization (PRAGO) method for differentiably estimating camera poses from a set of images.
Our method reconstructs the rotational pose, and in turn, the absolute pose, in a differentiable manner benefiting from the optimization of a sequence of geometrical tasks.
We show that PRAGO is able to outperform non-differentiable solvers on small and sparse scenes extracted from 7-Scenes achieving a relative improvement of 21% for rotations while achieving similar translation estimates.
arXiv Detail & Related papers (2024-03-13T14:42:55Z)
- AffineGlue: Joint Matching and Robust Estimation [74.04609046690913]
We propose AffineGlue, a method for joint two-view feature matching and robust estimation.
AffineGlue selects potential matches from one-to-many correspondences to estimate minimal models.
Guided matching is then used to find matches consistent with the model, suffering less from the ambiguities of one-to-one matches.
arXiv Detail & Related papers (2023-07-28T08:05:36Z)
- IMP: Iterative Matching and Pose Estimation with Adaptive Pooling [34.36397639248686]
We propose an efficient IMP, called EIMP, to dynamically discard keypoints without potential matches.
Experiments on YFCC100m, ScanNet, and Aachen Day-Night datasets demonstrate that the proposed method outperforms previous approaches in terms of accuracy and efficiency.
arXiv Detail & Related papers (2023-04-28T13:25:50Z)
- PoseMatcher: One-shot 6D Object Pose Estimation by Deep Feature Matching [51.142988196855484]
We propose PoseMatcher, an accurate, model-free, one-shot object pose estimator.
We create a new training pipeline for object to image matching based on a three-view system.
To enable PoseMatcher to attend to distinct input modalities, an image and a pointcloud, we introduce IO-Layer.
arXiv Detail & Related papers (2023-04-03T21:14:59Z)
- Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking [98.91894395941766]
We propose a novel online approach to learning the pose dynamics, which are independent of pose detections in the current frame.
Specifically, we derive this prediction of dynamics through a graph neural network (GNN) that explicitly accounts for both spatial-temporal and visual information.
Experiments on PoseTrack 2017 and PoseTrack 2018 datasets demonstrate that the proposed method achieves results superior to the state of the art on both human pose estimation and tracking tasks.
arXiv Detail & Related papers (2021-06-07T16:36:50Z)
- Self-supervised Keypoint Correspondences for Multi-Person Pose Estimation and Tracking in Videos [32.43899916477434]
We propose an approach that relies on keypoint correspondences for associating persons in videos.
Instead of training the network for estimating keypoint correspondences on video data, it is trained on a large-scale image dataset for human pose estimation.
Our approach achieves state-of-the-art results for multi-frame pose estimation and multi-person pose tracking on the PoseTrack 2017 and PoseTrack 2018 datasets.
arXiv Detail & Related papers (2020-04-27T09:02:24Z)
- Improving Few-shot Learning by Spatially-aware Matching and CrossTransformer [116.46533207849619]
We study the impact of scale and location mismatch in the few-shot learning scenario.
We propose a novel Spatially-aware Matching scheme to effectively perform matching across multiple scales and locations.
arXiv Detail & Related papers (2020-01-06T14:10:20Z)