InstantSfM: Fully Sparse and Parallel Structure-from-Motion
- URL: http://arxiv.org/abs/2510.13310v1
- Date: Wed, 15 Oct 2025 08:58:05 GMT
- Title: InstantSfM: Fully Sparse and Parallel Structure-from-Motion
- Authors: Jiankun Zhong, Zitong Zhan, Quankai Gao, Ziyu Chen, Haozhe Lou, Jiageng Mao, Ulrich Neumann, Yue Wang
- Abstract summary: Structure-from-Motion (SfM) is a method that recovers camera poses and scene geometry from uncalibrated images. Despite the state-of-the-art performance of traditional SfM methods such as COLMAP and GLOMAP, naive CPU-specialized implementations of bundle adjustment (BA) or global positioning (GP) introduce significant computational overhead when handling large-scale scenarios. In this paper, we unleash the full potential of GPU parallel computation to accelerate each critical stage of the standard SfM pipeline.
- Score: 18.540622250926624
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Structure-from-Motion (SfM), a method that recovers camera poses and scene geometry from uncalibrated images, is a central component in robotic reconstruction and simulation. Despite the state-of-the-art performance of traditional SfM methods such as COLMAP and its follow-up work, GLOMAP, naive CPU-specialized implementations of bundle adjustment (BA) or global positioning (GP) introduce significant computational overhead when handling large-scale scenarios, leading to a trade-off between accuracy and speed in SfM. Moreover, the blessing of efficient C++-based implementations in COLMAP and GLOMAP comes with the curse of limited flexibility, as they lack support for various external optimization options. On the other hand, while deep learning based SfM pipelines like VGGSfM and VGGT enable feed-forward 3D reconstruction, they are unable to scale to thousands of input views at once as GPU memory consumption increases sharply as the number of input views grows. In this paper, we unleash the full potential of GPU parallel computation to accelerate each critical stage of the standard SfM pipeline. Building upon recent advances in sparse-aware bundle adjustment optimization, our design extends these techniques to accelerate both BA and GP within a unified global SfM framework. Through extensive experiments on datasets of varying scales (e.g. 5000 images where VGGSfM and VGGT run out of memory), our method demonstrates up to about 40 times speedup over COLMAP while achieving consistently comparable or even improved reconstruction accuracy. Our project page can be found at https://cre185.github.io/InstantSfM/.
Related papers
- ImLoc: Revisiting Visual Localization with Image-based Representation [61.282162006394934]
We propose to augment each image with estimated depth maps to capture the geometric structure. This representation is easy to build and maintain, but achieves highest accuracy in challenging conditions. Our method achieves a new state-of-the-art accuracy on various standard benchmarks and outperforms existing memory-efficient methods at comparable map sizes.
arXiv Detail & Related papers (2026-01-07T18:51:51Z)
- CuSfM: CUDA-Accelerated Structure-from-Motion [13.047004116582423]
cuSfM is a computationally-accelerated offline Structure-from-Motion system. It generates comprehensive and non-redundant data associations for precise camera pose estimation and globally consistent mapping. The system is released as an open-source Python implementation, PyCuSfM, to facilitate research and applications in computer vision and robotics.
arXiv Detail & Related papers (2025-10-17T03:29:11Z)
- FastMap: Revisiting Structure from Motion through First-Order Optimization [26.930994695116198]
We propose FastMap, a new global structure from motion method focused on speed and simplicity. We show that FastMap is up to 10 times faster than COLMAP and GLOMAP with GPU acceleration and achieves comparable pose accuracy.
arXiv Detail & Related papers (2025-05-07T17:56:15Z)
- A Constrained Optimization Approach for Gaussian Splatting from Coarsely-posed Images and Noisy Lidar Point Clouds [37.043012716944496]
We introduce a constrained optimization method for simultaneous camera pose estimation and 3D reconstruction. Experiments demonstrate that the proposed method significantly outperforms the existing (multi-modal) 3DGS baseline.
arXiv Detail & Related papers (2025-04-12T08:34:43Z)
- Light3R-SfM: Towards Feed-forward Structure-from-Motion [34.47706116389972]
Light3R-SfM is a feed-forward, end-to-end learnable framework for efficient large-scale Structure-from-Motion. This work pioneers a data-driven, feed-forward SfM approach, paving the way toward scalable, accurate, and efficient 3D reconstruction in the wild.
arXiv Detail & Related papers (2025-01-24T20:46:04Z)
- Robust Incremental Structure-from-Motion with Hybrid Features [73.55745864762703]
We introduce an incremental Structure-from-Motion (SfM) system that leverages lines and their structured geometric relations.
Our system is consistently more robust and accurate compared to the widely used point-based state of the art in SfM.
arXiv Detail & Related papers (2024-09-29T22:20:32Z)
- Global Structure-from-Motion Revisited [57.30100303979393]
We propose GLOMAP as a new general-purpose system that outperforms the state of the art in global SfM.
In terms of accuracy and robustness, we achieve results on-par or superior to COLMAP, the most widely used incremental SfM.
We share our system as an open-source implementation.
arXiv Detail & Related papers (2024-07-29T17:54:24Z)
- Distributed Global Structure-from-Motion with a Deep Front-End [11.2064188838227]
We investigate whether leveraging the developments in feature extraction and matching helps global SfM perform on par with the SOTA incremental SfM approach (COLMAP).
Our SfM system is designed from the ground up to leverage distributed computation, enabling us to parallelize computation on multiple machines and scale to large scenes.
arXiv Detail & Related papers (2023-11-30T18:47:18Z)
- AdaSfM: From Coarse Global to Fine Incremental Adaptive Structure from Motion [48.835456049755166]
AdaSfM is a coarse-to-fine adaptive SfM approach that is scalable to large-scale and challenging datasets.
Our approach first does a coarse global SfM which improves the reliability of the view graph by leveraging measurements from low-cost sensors.
Our approach uses a threshold-adaptive strategy to align all local reconstructions to the coordinate frame of global SfM.
arXiv Detail & Related papers (2023-01-28T09:06:50Z)
- Image-specific Convolutional Kernel Modulation for Single Image Super-resolution [85.09413241502209]
We propose a novel image-specific convolutional kernel modulation (IKM).
We exploit the global contextual information of image or feature to generate an attention weight for adaptively modulating the convolutional kernels.
Experiments on single image super-resolution show that the proposed methods achieve superior performances over state-of-the-art methods.
arXiv Detail & Related papers (2021-11-16T11:05:10Z)
- Image Matching across Wide Baselines: From Paper to Practice [80.9424750998559]
We introduce a comprehensive benchmark for local features and robust estimation algorithms.
Our pipeline's modular structure allows easy integration, configuration, and combination of different methods.
We show that with proper settings, classical solutions may still outperform the perceived state of the art.
arXiv Detail & Related papers (2020-03-03T15:20:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.