Robust Incremental Structure-from-Motion with Hybrid Features
- URL: http://arxiv.org/abs/2409.19811v1
- Date: Sun, 29 Sep 2024 22:20:32 GMT
- Title: Robust Incremental Structure-from-Motion with Hybrid Features
- Authors: Shaohui Liu, Yidan Gao, Tianyi Zhang, Rémi Pautrat, Johannes L. Schönberger, Viktor Larsson, Marc Pollefeys,
- Abstract summary: We introduce an incremental Structure-from-Motion (SfM) system that leverages lines and their structured geometric relations.
Our system is consistently more robust and accurate compared to the widely used point-based state of the art in SfM.
- Score: 73.55745864762703
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Structure-from-Motion (SfM) has become a ubiquitous tool for camera calibration and scene reconstruction with many downstream applications in computer vision and beyond. While the state-of-the-art SfM pipelines have reached a high level of maturity in well-textured and well-configured scenes over the last decades, they still fall short of robustly solving the SfM problem in challenging scenarios. In particular, weakly textured scenes and poorly constrained configurations oftentimes cause catastrophic failures or large errors for the primarily keypoint-based pipelines. In these scenarios, line segments are often abundant and can offer complementary geometric constraints. Their large spatial extent and typically structured configurations lead to stronger geometric constraints as compared to traditional keypoint-based methods. In this work, we introduce an incremental SfM system that, in addition to points, leverages lines and their structured geometric relations. Our technical contributions span the entire pipeline (mapping, triangulation, registration) and we integrate these into a comprehensive end-to-end SfM system that we share as an open-source software with the community. We also present the first analytical method to propagate uncertainties for 3D optimized lines via sensitivity analysis. Experiments show that our system is consistently more robust and accurate compared to the widely used point-based state of the art in SfM -- achieving richer maps and more precise camera registrations, especially under challenging conditions. In addition, our uncertainty-aware localization module alone is able to consistently improve over the state of the art under both point-alone and hybrid setups.
Related papers
- MASt3R-SfM: a Fully-Integrated Solution for Unconstrained Structure-from-Motion [12.602510002753815]
We build upon a recently released foundation model for 3D vision that can robustly produce local 3D reconstructions and accurate matches.
We introduce a low-memory approach to accurately align these local reconstructions in a global coordinate system.
Our novel SfM pipeline is simple, scalable, fast and truly unconstrained, i.e. it can handle any collection of images, ordered or not.
arXiv Detail & Related papers (2024-09-27T21:29:58Z) - Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion [57.232688209606515]
We present HTCL, a novel Temporal Temporal Context Learning paradigm for improving camera-based semantic scene completion.
Our method ranks $1st$ on the Semantic KITTI benchmark and even surpasses LiDAR-based methods in terms of mIoU.
arXiv Detail & Related papers (2024-07-02T09:11:17Z) - GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image [94.56927147492738]
We introduce GeoWizard, a new generative foundation model designed for estimating geometric attributes from single images.
We show that leveraging diffusion priors can markedly improve generalization, detail preservation, and efficiency in resource usage.
We propose a simple yet effective strategy to segregate the complex data distribution of various scenes into distinct sub-distributions.
arXiv Detail & Related papers (2024-03-18T17:50:41Z) - Distributed Global Structure-from-Motion with a Deep Front-End [11.2064188838227]
We investigate whether leveraging the developments in feature extraction and matching helps global SfM perform on par with the SOTA incremental SfM approach (COLMAP)
Our SfM system is designed from the ground up to leverage distributed computation, enabling us to parallelize computation on multiple machines and scale to large scenes.
arXiv Detail & Related papers (2023-11-30T18:47:18Z) - Global-to-Local Modeling for Video-based 3D Human Pose and Shape
Estimation [53.04781510348416]
Video-based 3D human pose and shape estimations are evaluated by intra-frame accuracy and inter-frame smoothness.
We propose to structurally decouple the modeling of long-term and short-term correlations in an end-to-end framework, Global-to-Local Transformer (GLoT)
Our GLoT surpasses previous state-of-the-art methods with the lowest model parameters on popular benchmarks, i.e., 3DPW, MPI-INF-3DHP, and Human3.6M.
arXiv Detail & Related papers (2023-03-26T14:57:49Z) - AdaSfM: From Coarse Global to Fine Incremental Adaptive Structure from
Motion [48.835456049755166]
AdaSfM is a coarse-to-fine adaptive SfM approach that is scalable to large-scale and challenging datasets.
Our approach first does a coarse global SfM which improves the reliability of the view graph by leveraging measurements from low-cost sensors.
Our approach uses a threshold-adaptive strategy to align all local reconstructions to the coordinate frame of global SfM.
arXiv Detail & Related papers (2023-01-28T09:06:50Z) - Structure PLP-SLAM: Efficient Sparse Mapping and Localization using
Point, Line and Plane for Monocular, RGB-D and Stereo Cameras [13.693353009049773]
This paper demonstrates a visual SLAM system that utilizes point and line cloud for robust camera localization, simultaneously, with an embedded piece-wise planar reconstruction (PPR) module.
We address the challenge of reconstructing geometric primitives with scale ambiguity by proposing several run-time optimizations on the reconstructed lines and planes.
The results show that our proposed SLAM tightly incorporates the semantic features to boost both tracking as well as backend optimization.
arXiv Detail & Related papers (2022-07-13T09:05:35Z) - Residual Multiplicative Filter Networks for Multiscale Reconstruction [24.962697695403037]
We introduce a new coordinate network architecture and training scheme that enables coarse-to-fine optimization with fine-grained control over the frequency support of learned reconstructions.
We demonstrate how these modifications enable multiscale optimization for coarse-to-fine fitting to natural images.
We then evaluate our model on synthetically generated datasets for the the problem of single-particle cryo-EM reconstruction.
arXiv Detail & Related papers (2022-06-01T20:16:28Z) - Pushing the Envelope of Rotation Averaging for Visual SLAM [69.7375052440794]
We propose a novel optimization backbone for visual SLAM systems.
We leverage averaging to improve the accuracy, efficiency and robustness of conventional monocular SLAM systems.
Our approach can exhibit up to 10x faster with comparable accuracy against the state-art on public benchmarks.
arXiv Detail & Related papers (2020-11-02T18:02:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.