Image Matching across Wide Baselines: From Paper to Practice
- URL: http://arxiv.org/abs/2003.01587v5
- Date: Thu, 11 Feb 2021 13:50:17 GMT
- Title: Image Matching across Wide Baselines: From Paper to Practice
- Authors: Yuhe Jin and Dmytro Mishkin and Anastasiia Mishchuk and Jiri Matas and
Pascal Fua and Kwang Moo Yi and Eduard Trulls
- Abstract summary: We introduce a comprehensive benchmark for local features and robust estimation algorithms.
Our pipeline's modular structure allows easy integration, configuration, and combination of different methods.
We show that with proper settings, classical solutions may still outperform the perceived state of the art.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a comprehensive benchmark for local features and robust
estimation algorithms, focusing on the downstream task -- the accuracy of the
reconstructed camera pose -- as our primary metric. Our pipeline's modular
structure allows easy integration, configuration, and combination of different
methods and heuristics. This is demonstrated by embedding dozens of popular
algorithms and evaluating them, from seminal works to the cutting edge of
machine learning research. We show that with proper settings, classical
solutions may still outperform the perceived state of the art.
Besides establishing the actual state of the art, the conducted experiments
reveal unexpected properties of Structure from Motion (SfM) pipelines that can
help improve their performance, for both algorithmic and learned methods. Data
and code are available at https://github.com/vcg-uvic/image-matching-benchmark,
providing an easy-to-use and flexible framework for the benchmarking of local
features and robust estimation methods, both alongside and against
top-performing methods. This work provides a basis for the Image Matching
Challenge https://vision.uvic.ca/image-matching-challenge.
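The benchmark's primary metric is the accuracy of the reconstructed camera pose. A minimal sketch of one common way to score a relative pose, assuming the usual convention of angular errors in degrees (the function names here are illustrative, not the benchmark's actual API, and the exact thresholds and aggregation into a final accuracy score are defined by the benchmark itself):

```python
import numpy as np

def rotation_error_deg(R_gt, R_est):
    """Angular error between two rotation matrices, in degrees."""
    cos = (np.trace(R_est.T @ R_gt) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def translation_error_deg(t_gt, t_est):
    """Angular error between translation directions.

    Scale is unobservable from two views, so only the direction is scored.
    """
    cos = np.dot(t_gt, t_est) / (np.linalg.norm(t_gt) * np.linalg.norm(t_est))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def pose_error_deg(R_gt, t_gt, R_est, t_est):
    # One common convention: the pose error is the worse of the two
    # angular errors, so a pose is "accurate at threshold T" only if
    # both rotation and translation are within T degrees.
    return max(rotation_error_deg(R_gt, R_est),
               translation_error_deg(t_gt, t_est))
```

Errors of this kind can then be swept over a range of thresholds and averaged into a single accuracy number per method, which is what makes the downstream pose a usable primary metric.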
Related papers
- Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise and pointwise convolution) in efficient architectures.
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
arXiv Detail & Related papers (2022-10-23T18:37:22Z) - TopicFM: Robust and Interpretable Feature Matching with Topic-assisted [8.314830611853168]
We propose an architecture for image matching which is efficient, robust, and interpretable.
We introduce a novel feature matching module called TopicFM which can roughly organize same spatial structure across images into a topic.
Our method performs matching only in co-visible regions to reduce computation.
arXiv Detail & Related papers (2022-07-01T10:39:14Z) - A modular software framework for the design and implementation of
ptychography algorithms [55.41644538483948]
We present SciCom, a new ptychography software framework aiming at simulating ptychography datasets and testing state-of-the-art reconstruction algorithms.
Despite its simplicity, the software leverages accelerated processing through the PyTorch interface.
Results are shown on both synthetic and real datasets.
arXiv Detail & Related papers (2022-05-06T16:32:37Z) - Towards a Unified Approach to Homography Estimation Using Image Features
and Pixel Intensities [0.0]
The homography matrix is a key component in various vision-based robotic tasks.
Traditionally, homography estimation algorithms are classified into feature- or intensity-based.
This paper proposes a new hybrid method that unifies both classes into a single nonlinear optimization procedure.
arXiv Detail & Related papers (2022-02-20T02:47:05Z) - Summarize and Search: Learning Consensus-aware Dynamic Convolution for
Co-Saliency Detection [139.10628924049476]
Humans perform co-saliency detection by first summarizing the consensus knowledge in the whole group and then searching corresponding objects in each image.
Previous methods usually lack robustness, scalability, or stability for the first process and simply fuse consensus features with image features for the second process.
We propose a novel consensus-aware dynamic convolution model to explicitly and effectively perform the "summarize and search" process.
arXiv Detail & Related papers (2021-10-01T12:06:42Z) - Video Frame Interpolation via Structure-Motion based Iterative Fusion [19.499969588931414]
We propose a structure-motion based iterative fusion method for video frame interpolation.
Inspired by the observation that audiences have different visual preferences for foreground and background objects, we are the first to propose using saliency masks in the evaluation of video frame interpolation.
arXiv Detail & Related papers (2021-05-11T22:11:17Z) - How Fine-Tuning Allows for Effective Meta-Learning [50.17896588738377]
We present a theoretical framework for analyzing representations derived from a MAML-like algorithm.
We provide risk bounds on the best predictor found by fine-tuning via gradient descent, demonstrating that the algorithm can provably leverage the shared structure.
These results underscore the benefit of fine-tuning-based methods, such as MAML, over methods with "frozen representation" objectives in few-shot learning.
arXiv Detail & Related papers (2021-05-05T17:56:00Z) - Deep Keypoint-Based Camera Pose Estimation with Geometric Constraints [80.60538408386016]
Estimating relative camera poses from consecutive frames is a fundamental problem in visual odometry.
We propose an end-to-end trainable framework consisting of learnable modules for detection, feature extraction, matching and outlier rejection.
arXiv Detail & Related papers (2020-07-29T21:41:31Z)
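The classical, non-learned counterpart of such pipelines estimates the relative pose from matched keypoints via the essential matrix. A minimal noise-free sketch using the normalized eight-point algorithm on calibrated correspondences (purely illustrative: `essential_from_matches` is a hypothetical helper, not an API from any of the papers above, and real pipelines wrap this step in RANSAC for outlier rejection):

```python
import numpy as np

def essential_from_matches(x1, x2):
    """Eight-point estimate of the essential matrix E from calibrated
    correspondences x1, x2 of shape (N, 2), with N >= 8.

    Points are assumed to be already normalized by the camera intrinsics,
    i.e. x = (X/Z, Y/Z) in the camera frame.
    """
    a1 = np.hstack([x1, np.ones((len(x1), 1))])
    a2 = np.hstack([x2, np.ones((len(x2), 1))])
    # Each correspondence yields one linear constraint a2^T E a1 = 0;
    # stacking them gives A vec(E) = 0.
    A = np.einsum('ni,nj->nij', a2, a1).reshape(len(x1), 9)
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)
    # Project onto the essential-matrix manifold: singular values (1, 1, 0).
    U, _, Vt = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt
```

The learned pipelines discussed above replace parts of this chain (detection, description, matching, outlier rejection) with trainable modules, but the recovered quantity, a rank-two essential matrix encoding relative rotation and translation direction, is the same.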
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.