Pixel-Perfect Structure-from-Motion with Featuremetric Refinement
- URL: http://arxiv.org/abs/2108.08291v1
- Date: Wed, 18 Aug 2021 17:58:55 GMT
- Title: Pixel-Perfect Structure-from-Motion with Featuremetric Refinement
- Authors: Philipp Lindenberger, Paul-Edouard Sarlin, Viktor Larsson, Marc Pollefeys
- Abstract summary: We refine two key steps of structure-from-motion by a direct alignment of low-level image information from multiple views.
This significantly improves the accuracy of camera poses and scene geometry for a wide range of keypoint detectors.
Our system easily scales to large image collections, enabling pixel-perfect crowd-sourced localization at scale.
- Score: 96.73365545609191
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Finding local features that are repeatable across multiple views is a
cornerstone of sparse 3D reconstruction. The classical image matching paradigm
detects keypoints per-image once and for all, which can yield poorly-localized
features and propagate large errors to the final geometry. In this paper, we
refine two key steps of structure-from-motion by a direct alignment of
low-level image information from multiple views: we first adjust the initial
keypoint locations prior to any geometric estimation, and subsequently refine
points and camera poses as a post-processing step. This refinement is robust to
large detection noise and appearance changes, as it optimizes a featuremetric
error based on dense features predicted by a neural network. This significantly
improves the accuracy of camera poses and scene geometry for a wide range of
keypoint detectors, challenging viewing conditions, and off-the-shelf deep
features. Our system easily scales to large image collections, enabling
pixel-perfect crowd-sourced localization at scale. Our code is publicly
available at https://github.com/cvg/pixel-perfect-sfm as an add-on to the
popular SfM software COLMAP.
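
The keypoint adjustment described in the abstract can be illustrated with a toy two-view example. This is a minimal sketch under stated assumptions, not the authors' implementation: the dense features here are synthetic sinusoids rather than the output of a neural network, only one keypoint in one target view is moved, the Jacobian is approximated numerically, and the names `synthetic_features`, `bilinear`, and `refine_keypoint` are hypothetical. It only shows the core idea of minimizing a featuremetric error with Gauss-Newton steps.

```python
import numpy as np


def synthetic_features(h, w, shift=(0.0, 0.0)):
    """Smooth stand-in for a dense feature map (assumption: no CNN involved).

    `shift` = (dx, dy) translates the content, which mimics a second view.
    """
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    xs = xs - shift[0]
    ys = ys - shift[1]
    return np.stack([np.sin(xs / 6), np.cos(ys / 6), xs / w, ys / h], axis=-1)


def bilinear(feat, xy):
    """Sample an (H, W, D) feature map at a sub-pixel location xy = (x, y)."""
    h, w, _ = feat.shape
    x = float(np.clip(xy[0], 0.0, w - 1.001))
    y = float(np.clip(xy[1], 0.0, h - 1.001))
    x0, y0 = int(x), int(y)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * feat[y0, x0] + ax * feat[y0, x0 + 1]
    bot = (1 - ax) * feat[y0 + 1, x0] + ax * feat[y0 + 1, x0 + 1]
    return (1 - ay) * top + ay * bot


def refine_keypoint(feat_ref, xy_ref, feat_tgt, xy_init,
                    iters=20, eps=0.5, damping=1e-6):
    """Move xy_init so that feat_tgt(xy) matches feat_ref(xy_ref)."""
    target = bilinear(feat_ref, xy_ref)
    xy = np.asarray(xy_init, dtype=float).copy()
    dx = np.array([eps, 0.0])
    dy = np.array([0.0, eps])
    for _ in range(iters):
        # Featuremetric residual between the two views.
        r = bilinear(feat_tgt, xy) - target
        # Central-difference Jacobian of the sampled feature w.r.t. (x, y).
        J = np.stack([
            (bilinear(feat_tgt, xy + dx) - bilinear(feat_tgt, xy - dx)) / (2 * eps),
            (bilinear(feat_tgt, xy + dy) - bilinear(feat_tgt, xy - dy)) / (2 * eps),
        ], axis=1)
        # Damped Gauss-Newton step.
        step = np.linalg.solve(J.T @ J + damping * np.eye(2), -J.T @ r)
        xy += step
        if np.linalg.norm(step) < 1e-6:
            break
    return xy


# View b shows the same content as view a, shifted by (+3, -2) pixels.
feat_a = synthetic_features(64, 64)
feat_b = synthetic_features(64, 64, shift=(3.0, -2.0))

xy_a = np.array([30.0, 25.0])      # reference keypoint in view a
xy_true = np.array([33.0, 23.0])   # where that point appears in view b
xy_noisy = np.array([34.4, 21.8])  # poorly localized detection in view b

refined = refine_keypoint(feat_a, xy_a, feat_b, xy_noisy)
err_before = np.linalg.norm(xy_noisy - xy_true)
err_after = np.linalg.norm(refined - xy_true)
print("refined keypoint:", refined)
print("error before / after (px):", err_before, err_after)
```

In this toy setup the refined location should land close to the true correspondence, because the synthetic features are smooth and the residual vanishes at the correct position. Real images would typically need learned features, a robust cost, and joint optimization over all observations of a track, as the abstract describes.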
Related papers
- Scene Coordinate Reconstruction: Posing of Image Collections via Incremental Learning of a Relocalizer [21.832249148699397]
We address the task of estimating camera parameters from a set of images depicting a scene.
We show that scene coordinate regression, a learning-based relocalization approach, allows us to build implicit, neural scene representations from unposed images.
arXiv Detail & Related papers (2024-04-22T17:02:33Z)
- Visual Geometry Grounded Deep Structure From Motion [20.203320509695306]
We propose a new deep pipeline VGGSfM, where each component is fully differentiable and can be trained in an end-to-end manner.
First, we build on recent advances in deep 2D point tracking to extract reliable pixel-accurate tracks, which eliminates the need for chaining pairwise matches.
We attain state-of-the-art performance on three popular datasets, CO3D, IMC Phototourism, and ETH3D.
arXiv Detail & Related papers (2023-12-07T18:59:52Z)
- FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z)
- Pixel2Mesh++: 3D Mesh Generation and Refinement from Multi-View Images [82.32776379815712]
We study the problem of shape generation in 3D mesh representation from a small number of color images with or without camera poses.
We further improve the shape quality by leveraging cross-view information with a graph convolution network.
Our model is robust to the quality of the initial mesh and the error of camera pose, and can be combined with a differentiable function for test-time optimization.
arXiv Detail & Related papers (2022-04-21T03:42:31Z)
- Soft Expectation and Deep Maximization for Image Feature Detection [68.8204255655161]
We propose SEDM, an iterative semi-supervised learning process that flips the question and first looks for repeatable 3D points, then trains a detector to localize them in image space.
Our results show that this new model trained using SEDM is able to better localize the underlying 3D points in a scene.
arXiv Detail & Related papers (2021-04-21T00:35:32Z)
- Back to the Feature: Learning Robust Camera Localization from Pixels to Pose [114.89389528198738]
We introduce PixLoc, a scene-agnostic neural network that estimates an accurate 6-DoF pose from an image and a 3D model.
The system can localize in large environments given coarse pose priors but also improve the accuracy of sparse feature matching.
arXiv Detail & Related papers (2021-03-16T17:40:12Z)
- Multi-View Optimization of Local Feature Geometry [70.18863787469805]
We address the problem of refining the geometry of local image features from multiple views without known scene or camera geometry.
Our proposed method naturally complements the traditional feature extraction and matching paradigm.
We show that our method consistently improves the triangulation and camera localization performance for both hand-crafted and learned local features.
arXiv Detail & Related papers (2020-03-18T17:22:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.