CuSfM: CUDA-Accelerated Structure-from-Motion
- URL: http://arxiv.org/abs/2510.15271v1
- Date: Fri, 17 Oct 2025 03:29:11 GMT
- Title: CuSfM: CUDA-Accelerated Structure-from-Motion
- Authors: Jingrui Yu, Jun Liu, Kefei Ren, Joydeep Biswas, Rurui Ye, Keqiang Wu, Chirag Majithia, Di Zeng,
- Abstract summary: cuSfM is a CUDA-accelerated offline Structure-from-Motion system. It generates comprehensive and non-redundant data associations for precise camera pose estimation and globally consistent mapping. The system is released as an open-source Python implementation, PyCuSfM, to facilitate research and applications in computer vision and robotics.
- Score: 13.047004116582423
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Efficient and accurate camera pose estimation forms the foundational requirement for dense reconstruction in autonomous navigation, robotic perception, and virtual simulation systems. This paper addresses the challenge via cuSfM, a CUDA-accelerated offline Structure-from-Motion system that leverages GPU parallelization to efficiently employ computationally intensive yet highly accurate feature extractors, generating comprehensive and non-redundant data associations for precise camera pose estimation and globally consistent mapping. The system supports pose optimization, mapping, prior-map localization, and extrinsic refinement. It is designed for offline processing, where computational resources can be fully utilized to maximize accuracy. Experimental results demonstrate that cuSfM achieves significantly improved accuracy and processing speed compared to the widely used COLMAP method across various testing scenarios, while maintaining the high precision and global consistency essential for offline SfM applications. The system is released as an open-source Python wrapper implementation, PyCuSfM, available at https://github.com/nvidia-isaac/pyCuSFM, to facilitate research and applications in computer vision and robotics.
Related papers
- DPVO-QAT++: Heterogeneous QAT and CUDA Kernel Fusion for High-Performance Deep Patch Visual Odometry [0.8122270502556375]
This paper presents DPVO-QAT++ (Heterogeneous QAT and Kernel Fusion for High-Performance Deep Patch Visual Odometry), a hierarchical quantization optimization framework.
arXiv Detail & Related papers (2025-11-16T15:38:25Z) - InstantSfM: Fully Sparse and Parallel Structure-from-Motion [18.540622250926624]
Structure-from-Motion (SfM) is a method that recovers camera poses and scene geometry from uncalibrated images. GLOMAP, naive CPU-specialized implementations of bundle adjustment (BA) or global positioning (GP) introduce significant computational overhead when handling large-scale scenarios. In this paper, we unleash the full potential of GPU parallel computation to accelerate each critical stage of the standard SfM pipeline.
arXiv Detail & Related papers (2025-10-15T08:58:05Z) - PICT -- A Differentiable, GPU-Accelerated Multi-Block PISO Solver for Simulation-Coupled Learning Tasks in Fluid Dynamics [59.38498811984876]
We present our fluid simulator PICT, a differentiable pressure-implicit solver coded in PyTorch with graphics-processing-unit (GPU) support. We first verify the accuracy of both the forward simulation and our derived gradients in various established benchmarks. We show that the gradients provided by our solver can be used to learn complicated turbulence models in 2D and 3D.
arXiv Detail & Related papers (2025-05-22T17:55:10Z) - Efficient Transformed Gaussian Process State-Space Models for Non-Stationary High-Dimensional Dynamical Systems [49.819436680336786]
We propose an efficient transformed Gaussian process state-space model (ETGPSSM) for scalable and flexible modeling of high-dimensional, non-stationary dynamical systems. Specifically, our ETGPSSM integrates a single shared GP with input-dependent normalizing flows, yielding an expressive implicit process prior that captures complex, non-stationary transition dynamics. Our ETGPSSM outperforms existing GPSSMs and neural network-based SSMs in terms of computational efficiency and accuracy.
arXiv Detail & Related papers (2025-03-24T03:19:45Z) - QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge [55.75103034526652]
We propose QuartDepth, which adopts post-training quantization to quantize MDE models with hardware accelerations for ASICs. Our approach involves quantizing both weights and activations to 4-bit precision, reducing the model size and computation cost. We design a flexible and programmable hardware accelerator by supporting kernel fusion and customized instruction programmability.
arXiv Detail & Related papers (2025-03-20T21:03:10Z) - DQO-MAP: Dual Quadrics Multi-Object mapping with Gaussian Splatting [6.736949053673975]
We propose a novel object-SLAM system that seamlessly integrates object pose estimation and reconstruction. DQO-MAP achieves outstanding performance in terms of precision, reconstruction quality, and computational efficiency.
arXiv Detail & Related papers (2025-03-04T02:55:07Z) - Federated Conditional Stochastic Optimization [110.513884892319]
Conditional optimization has found applications in a wide range of machine learning tasks, such as invariant learning tasks, AUPRC, and AML.
This paper proposes algorithms for distributed federated learning.
arXiv Detail & Related papers (2023-10-04T01:47:37Z) - Local object crop collision network for efficient simulation of non-convex objects in GPU-based simulators [6.33790920152602]
Our goal is to develop an efficient contact detection algorithm for large-scale simulation of non-convex objects.
We propose a data-driven approach for CD, whose accuracy depends only on the quality and quantity of supplementary materials.
arXiv Detail & Related papers (2023-04-19T06:09:12Z) - AdaSfM: From Coarse Global to Fine Incremental Adaptive Structure from Motion [48.835456049755166]
AdaSfM is a coarse-to-fine adaptive SfM approach that is scalable to large-scale and challenging datasets.
Our approach first does a coarse global SfM which improves the reliability of the view graph by leveraging measurements from low-cost sensors.
Our approach uses a threshold-adaptive strategy to align all local reconstructions to the coordinate frame of global SfM.
arXiv Detail & Related papers (2023-01-28T09:06:50Z) - Transformer-based Context Condensation for Boosting Feature Pyramids in Object Detection [77.50110439560152]
Current object detectors typically have a feature pyramid (FP) module for multi-level feature fusion (MFF).
We propose a novel and efficient context modeling mechanism that can help existing FPs deliver better MFF results.
In particular, we introduce a novel insight that comprehensive contexts can be decomposed and condensed into two types of representations for higher efficiency.
arXiv Detail & Related papers (2022-07-14T01:45:03Z) - CUDA-Optimized real-time rendering of a Foveated Visual System [5.260841516691153]
We present a technique that exploits the GPU to efficiently generate Gaussian-based foveated images at high definition (1920x1080) in real-time (165 Hz).
Our algorithm can meet demand for spatially-varying processing across biological and artificial agents so that foveation can be added easily on top of existing systems.
arXiv Detail & Related papers (2020-12-15T22:43:04Z) - Image Matching across Wide Baselines: From Paper to Practice [80.9424750998559]
We introduce a comprehensive benchmark for local features and robust estimation algorithms.
Our pipeline's modular structure allows easy integration, configuration, and combination of different methods.
We show that with proper settings, classical solutions may still outperform the perceived state of the art.
arXiv Detail & Related papers (2020-03-03T15:20:57Z)