Distributed Global Structure-from-Motion with a Deep Front-End
- URL: http://arxiv.org/abs/2311.18801v1
- Date: Thu, 30 Nov 2023 18:47:18 GMT
- Title: Distributed Global Structure-from-Motion with a Deep Front-End
- Authors: Ayush Baid, John Lambert, Travis Driver, Akshay Krishnan, Hayk Stepanyan, and Frank Dellaert
- Abstract summary: We investigate whether leveraging the developments in feature extraction and matching helps global SfM perform on par with the SOTA incremental SfM approach (COLMAP).
Our SfM system is designed from the ground up to leverage distributed computation, enabling us to parallelize computation on multiple machines and scale to large scenes.
- Score: 11.2064188838227
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While initial approaches to Structure-from-Motion (SfM) revolved around both
global and incremental methods, most recent applications rely on incremental
systems to estimate camera poses due to their superior robustness. Though there
has been tremendous progress in SfM 'front-ends' powered by deep models learned
from data, the state-of-the-art (incremental) SfM pipelines still rely on
classical SIFT features, developed in 2004. In this work, we investigate
whether leveraging the developments in feature extraction and matching helps
global SfM perform on par with the SOTA incremental SfM approach (COLMAP). To
do so, we design a modular SfM framework that allows us to easily combine
developments in different stages of the SfM pipeline. Our experiments show that
while developments in deep-learning based two-view correspondence estimation do
translate to improvements in point density for scenes reconstructed with global
SfM, none of them outperform SIFT when comparing with incremental SfM results
on a range of datasets. Our SfM system is designed from the ground up to
leverage distributed computation, enabling us to parallelize computation on
multiple machines and scale to large scenes.
Related papers
- EPS-MoE: Expert Pipeline Scheduler for Cost-Efficient MoE Inference [49.94169109038806]
This paper introduces EPS-MoE, a novel expert pipeline scheduler for MoE.
Our results demonstrate an average 21% improvement in prefill throughput over existing parallel inference methods.
arXiv Detail & Related papers (2024-10-16T05:17:49Z)
- Robust Incremental Structure-from-Motion with Hybrid Features [73.55745864762703]
We introduce an incremental Structure-from-Motion (SfM) system that leverages lines and their structured geometric relations.
Our system is consistently more robust and accurate compared to the widely used point-based state of the art in SfM.
arXiv Detail & Related papers (2024-09-29T22:20:32Z)
- Global Structure-from-Motion Revisited [57.30100303979393]
We propose GLOMAP as a new general-purpose system that outperforms the state of the art in global SfM.
In terms of accuracy and robustness, we achieve results on par with or superior to COLMAP, the most widely used incremental SfM.
We share our system as an open-source implementation.
arXiv Detail & Related papers (2024-07-29T17:54:24Z)
- Towards Scale-Aware Full Surround Monodepth with Transformers [46.100897032607335]
Full surround monodepth (FSM) methods can learn from multiple camera views simultaneously to predict the scale-aware depth.
In this work, we focus on enhancing the scale-awareness of FSM methods for depth estimation.
arXiv Detail & Related papers (2024-07-15T02:54:46Z)
- AdaSfM: From Coarse Global to Fine Incremental Adaptive Structure from Motion [48.835456049755166]
AdaSfM is a coarse-to-fine adaptive SfM approach that is scalable to large-scale and challenging datasets.
Our approach first performs a coarse global SfM, which improves the reliability of the view graph by leveraging measurements from low-cost sensors.
Our approach uses a threshold-adaptive strategy to align all local reconstructions to the coordinate frame of global SfM.
arXiv Detail & Related papers (2023-01-28T09:06:50Z)
- DeepMLE: A Robust Deep Maximum Likelihood Estimator for Two-view Structure from Motion [9.294501649791016]
Two-view structure from motion (SfM) is the cornerstone of 3D reconstruction and visual SLAM (vSLAM).
We formulate the two-view SfM problem as a maximum likelihood estimation (MLE) and solve it with the proposed framework, denoted as DeepMLE.
Our method significantly outperforms the state-of-the-art end-to-end two-view SfM approaches in accuracy and generalization capability.
arXiv Detail & Related papers (2022-10-11T15:07:25Z)
- Transformer-based Context Condensation for Boosting Feature Pyramids in Object Detection [77.50110439560152]
Current object detectors typically have a feature pyramid (FP) module for multi-level feature fusion (MFF).
We propose a novel and efficient context modeling mechanism that can help existing FPs deliver better MFF results.
In particular, we introduce a novel insight that comprehensive contexts can be decomposed and condensed into two types of representations for higher efficiency.
arXiv Detail & Related papers (2022-07-14T01:45:03Z) - DeMFI: Deep Joint Deblurring and Multi-Frame Interpolation with
Flow-Guided Attentive Correlation and Recursive Boosting [50.17500790309477]
DeMFI-Net is a joint deblurring and multi-frame interpolation framework.
It converts blurry, lower-frame-rate videos into sharp, higher-frame-rate videos.
It achieves state-of-the-art (SOTA) performance on diverse datasets.
arXiv Detail & Related papers (2021-11-19T00:00:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.