cuVSLAM: CUDA accelerated visual odometry and mapping
- URL: http://arxiv.org/abs/2506.04359v3
- Date: Tue, 08 Jul 2025 16:53:53 GMT
- Title: cuVSLAM: CUDA accelerated visual odometry and mapping
- Authors: Alexander Korovko, Dmitry Slepichev, Alexander Efitorov, Aigul Dzhumamuratova, Viktor Kuznetsov, Hesam Rabeti, Joydeep Biswas, Soha Pouya
- Abstract summary: cuVSLAM is a state-of-the-art solution for visual simultaneous localization and mapping. It can operate with a variety of visual-inertial sensor suites, including multiple RGB and depth cameras, and inertial measurement units.
- Score: 72.43057259584663
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Accurate and robust pose estimation is a key requirement for any autonomous robot. We present cuVSLAM, a state-of-the-art solution for visual simultaneous localization and mapping, which can operate with a variety of visual-inertial sensor suites, including multiple RGB and depth cameras, and inertial measurement units. cuVSLAM supports operation with as few as one RGB camera to as many as 32 cameras, in arbitrary geometric configurations, thus supporting a wide range of robotic setups. cuVSLAM is specifically optimized using CUDA to deploy in real-time applications with minimal computational overhead on edge-computing devices such as the NVIDIA Jetson. We present the design and implementation of cuVSLAM, example use cases, and empirical results on several state-of-the-art benchmarks demonstrating the best-in-class performance of cuVSLAM.
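The abstract's central architectural claim is support for one to 32 cameras in arbitrary geometric configurations. In any such multi-camera system, each camera is described by a fixed extrinsic transform relative to a common rig frame, and odometry estimates a single rig pose per timestamp. The sketch below illustrates that convention in plain numpy; the rig layout and all names are illustrative assumptions, not the actual cuVSLAM API.

```python
import numpy as np

def se3(R, t):
    """Assemble a 4x4 homogeneous transform from rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def rot_y(deg):
    """Rotation about the rig's y (up) axis, in degrees."""
    a = np.deg2rad(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

# Hypothetical 3-camera rig: a forward stereo pair plus a rear-facing camera.
# Each entry maps camera-frame points into the common rig frame.
rig_T_cam = [
    se3(np.eye(3),    [-0.06, 0.0, 0.0]),   # left stereo camera
    se3(np.eye(3),    [+0.06, 0.0, 0.0]),   # right stereo camera
    se3(rot_y(180.0), [0.00, 0.0, -0.30]),  # rear-facing camera
]

# A landmark seen 2 m in front of the rear camera, expressed in the rig frame:
p_cam = np.array([0.0, 0.0, 2.0, 1.0])
p_rig = rig_T_cam[2] @ p_cam
print(p_rig[:3])  # -> approximately [0, 0, -2.3]
```

Once the odometry has a rig pose, each camera's world pose follows by composing with its fixed extrinsic, which is what makes arbitrary geometric configurations tractable.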
Related papers
- MM3DGS SLAM: Multi-modal 3D Gaussian Splatting for SLAM Using Vision, Depth, and Inertial Measurements [59.70107451308687]
We show for the first time that using 3D Gaussians for map representation with unposed camera images and inertial measurements can enable accurate SLAM.
Our method, MM3DGS, addresses the limitations of prior rendering-based methods, enabling faster scale awareness and improved trajectory tracking.
We also release a multi-modal dataset, UT-MM, collected from a mobile robot equipped with a camera and an inertial measurement unit.
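As background for the "3D Gaussians for map representation" claim above: a 3D Gaussian splat is conventionally parameterized by a mean, a covariance factored into a rotation and per-axis scales, an opacity, and a color. The following is a minimal sketch of such a primitive under those standard conventions, not MM3DGS's actual code:

```python
import numpy as np
from dataclasses import dataclass, field

@dataclass
class Gaussian3D:
    """Generic 3D Gaussian map primitive (illustrative, not MM3DGS's)."""
    mean: np.ndarray                  # (3,) center in world coordinates
    log_scale: np.ndarray             # (3,) per-axis scales, stored in log space
    quat: np.ndarray                  # (4,) unit quaternion (w, x, y, z)
    opacity: float = 1.0
    color: np.ndarray = field(default_factory=lambda: np.ones(3))

    def covariance(self) -> np.ndarray:
        """Sigma = R diag(s)^2 R^T, positive semi-definite by construction."""
        w, x, y, z = self.quat / np.linalg.norm(self.quat)
        R = np.array([
            [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
            [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
            [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
        ])
        S = np.diag(np.exp(self.log_scale))
        return R @ S @ S @ R.T

g = Gaussian3D(mean=np.zeros(3),
               log_scale=np.log([0.10, 0.10, 0.02]),  # a flat, disc-like splat
               quat=np.array([1.0, 0.0, 0.0, 0.0]))
print(g.covariance())
```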
arXiv Detail & Related papers (2024-04-01T04:57:41Z)
- U-ARE-ME: Uncertainty-Aware Rotation Estimation in Manhattan Environments [18.534567960292403]
We present U-ARE-ME, an algorithm that estimates camera rotation along with uncertainty from uncalibrated RGB images.
Our experiments demonstrate that U-ARE-ME performs comparably to RGB-D methods and is more robust than sparse feature-based SLAM methods.
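One generic way downstream systems can exploit rotations that carry uncertainty is inverse-variance-weighted rotation averaging, which projects a weighted sum of rotation matrices back onto SO(3). The sketch below shows that standard projection; it is background on consuming uncertain rotations, not U-ARE-ME's algorithm:

```python
import numpy as np

def average_rotations(Rs, weights):
    """Weighted chordal-L2 rotation average: maximize sum_i w_i tr(R^T R_i)
    by projecting M = sum_i w_i R_i back onto SO(3) with an SVD."""
    M = sum(w * R for w, R in zip(weights, Rs))
    U, _, Vt = np.linalg.svd(M)
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])  # enforce det(R) = +1
    return U @ D @ Vt

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

# Two noisy estimates of the identity rotation, weighted by inverse variance.
estimates = [rot_z(0.04), rot_z(-0.02)]
weights = [1 / 0.1**2, 1 / 0.2**2]
print(average_rotations(estimates, weights))  # ~rot_z(0.028), pulled toward the more certain estimate
```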
arXiv Detail & Related papers (2024-03-22T19:14:28Z)
- Anyview: Generalizable Indoor 3D Object Detection with Variable Frames [63.51422844333147]
We present a novel 3D detection framework named AnyView for practical applications.
Our method achieves both great generalizability and high detection accuracy with a simple and clean architecture.
arXiv Detail & Related papers (2023-10-09T02:15:45Z)
- Neural Implicit Dense Semantic SLAM [83.04331351572277]
We propose a novel RGB-D vSLAM algorithm that learns memory-efficient dense 3D geometry and semantic segmentation of an indoor scene in an online manner.
Our pipeline combines classical 3D vision-based tracking and loop closing with neural fields-based mapping.
Our proposed algorithm can greatly enhance scene perception and assist with a range of robot control problems.
arXiv Detail & Related papers (2023-04-27T23:03:52Z)
- Performance Optimization using Multimodal Modeling and Heterogeneous GNN [1.304892050913381]
We propose a technique for tuning parallel code regions that is general enough to be adapted to multiple tasks.
In this paper, we analyze IR-based programming models to make task-specific performance optimizations.
This multimodal learning-based approach outperforms the state of the art in all of our experiments.
arXiv Detail & Related papers (2023-04-25T04:27:43Z)
- NICER-SLAM: Neural Implicit Scene Encoding for RGB SLAM [111.83168930989503]
NICER-SLAM is a dense RGB SLAM system that simultaneously optimizes camera poses and a hierarchical neural implicit map representation.
We show strong performance in dense mapping, tracking, and novel view synthesis, even competitive with recent RGB-D SLAM systems.
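The core idea of jointly optimizing camera poses and a neural implicit map can be sketched in a few lines of PyTorch. The toy below is a heavily simplified illustration, not NICER-SLAM itself: one frame, a translation-only pose, and a signed-distance map instead of a full radiance field. It refines the pose correction `t` and the map weights together so observed surface points land on the zero level set:

```python
import torch

# A tiny implicit map: f(x) -> signed distance for x in R^3.
sdf = torch.nn.Sequential(
    torch.nn.Linear(3, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 1),
)

# Synthetic observations: surface points of a unit sphere, seen from a camera
# whose assumed pose is wrong by an unknown translation (rotation omitted).
true_offset = torch.tensor([0.05, -0.02, 0.03])
surface = torch.nn.functional.normalize(torch.randn(512, 3), dim=1)
pts_cam = surface - true_offset

# Jointly optimized unknowns: the map weights and the pose correction t.
t = torch.zeros(3, requires_grad=True)
opt = torch.optim.Adam(list(sdf.parameters()) + [t], lr=1e-2)

for _ in range(500):
    opt.zero_grad()
    pts_world = pts_cam + t                    # apply the current pose estimate
    on_surface = sdf(pts_world).squeeze(-1)    # should be zero on the surface
    # Anchor the SDF away from the surface so map and pose cannot both drift.
    far = 2.0 * torch.nn.functional.normalize(torch.randn(256, 3), dim=1)
    anchor = (sdf(far).squeeze(-1) - 1.0) ** 2  # the radius-2 shell has SDF ~ 1
    loss = (on_surface ** 2).mean() + anchor.mean()
    loss.backward()
    opt.step()

print(t.detach())  # moves toward true_offset; the anchor weakly fixes the map/pose gauge
```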
arXiv Detail & Related papers (2023-02-07T17:06:34Z)
- Visual Odometry for RGB-D Cameras [3.655021726150368]
This paper develops a fast and accurate approach to visual odometry for a moving RGB-D camera navigating in a static environment.
The proposed algorithm uses SURF as the feature extractor, RANSAC to reject outlier matches, and a minimum mean-square-error fit to estimate the six-parameter rigid transformation between successive video frames.
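The geometric core of this pipeline, the minimum-mean-square rigid transform between matched 3D point sets, has the classical closed-form Kabsch/SVD solution, and RANSAC wraps it to discard bad correspondences. Below is a self-contained sketch of that core; feature extraction and depth back-projection are omitted, and the threshold values are illustrative assumptions:

```python
import numpy as np

def kabsch(P, Q):
    """Closed-form least-squares (minimum mean square) rigid transform:
    find R, t minimizing sum_i ||R @ P[i] + t - Q[i]||^2."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T    # proper rotation, det = +1
    return R, cQ - R @ cP

def ransac_rigid(P, Q, iters=200, thresh=0.02):
    """3-point RANSAC around kabsch(); thresh (meters) is illustrative."""
    rng = np.random.default_rng(0)
    best = np.zeros(len(P), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(P), size=3, replace=False)
        R, t = kabsch(P[idx], Q[idx])
        err = np.linalg.norm(P @ R.T + t - Q, axis=1)
        inliers = err < thresh
        if inliers.sum() > best.sum():
            best = inliers
    return kabsch(P[best], Q[best])            # refit on the consensus set

# Synthetic check: rotation about z, a translation, and 20% corrupted matches.
rng = np.random.default_rng(1)
a = 0.3
Rz = np.array([[np.cos(a), -np.sin(a), 0], [np.sin(a), np.cos(a), 0], [0, 0, 1]])
P = rng.normal(size=(100, 3))
Q = P @ Rz.T + np.array([0.1, -0.2, 0.05])
Q[:20] += rng.normal(scale=0.5, size=(20, 3))
R_est, t_est = ransac_rigid(P, Q)
print(np.allclose(R_est, Rz, atol=1e-6), t_est)  # True [ 0.1 -0.2  0.05]
```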
arXiv Detail & Related papers (2022-03-28T21:49:12Z)
- Unified Multi-Modal Landmark Tracking for Tightly Coupled Lidar-Visual-Inertial Odometry [5.131684964386192]
We present an efficient multi-sensor odometry system for mobile platforms that jointly optimizes visual, lidar, and inertial information.
A new method for extracting 3D line and planar primitives from lidar point clouds is presented.
The system has been tested on a variety of platforms and scenarios, including underground exploration with a legged robot and outdoor scanning with a dynamically moving handheld device.
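Extracting planar primitives from a lidar scan is commonly done with RANSAC plane fitting. The sketch below shows one such extraction step in generic form, not the paper's specific method; the inlier threshold is an illustrative assumption:

```python
import numpy as np

def fit_plane(pts):
    """Least-squares plane through pts: unit normal n and offset d, n.p + d = 0."""
    centroid = pts.mean(axis=0)
    _, _, Vt = np.linalg.svd(pts - centroid)
    n = Vt[-1]                          # direction of least variance
    return n, -n @ centroid

def ransac_plane(pts, iters=300, thresh=0.05):
    """Largest planar support set via 3-point RANSAC; refit on the inliers."""
    rng = np.random.default_rng(0)
    best = np.zeros(len(pts), dtype=bool)
    for _ in range(iters):
        sample = pts[rng.choice(len(pts), size=3, replace=False)]
        n, d = fit_plane(sample)
        inliers = np.abs(pts @ n + d) < thresh
        if inliers.sum() > best.sum():
            best = inliers
    n, d = fit_plane(pts[best])
    return n, d, best

# Example: a dominant z = 0.5 plane plus uniform clutter.
rng = np.random.default_rng(2)
plane = np.column_stack([rng.uniform(-1, 1, 300), rng.uniform(-1, 1, 300), np.full(300, 0.5)])
clutter = rng.uniform(-1, 1, size=(100, 3))
n, d, mask = ransac_plane(np.vstack([plane, clutter]))
print(n, d)  # unit normal ~ [0, 0, +/-1], offset |d| ~ 0.5
```

In a sequential-RANSAC front-end, the inliers would be removed and the procedure repeated to harvest the next primitive.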
arXiv Detail & Related papers (2020-11-13T09:54:03Z)
- Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation [57.11299763566534]
We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera viewpoints.
Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2D detections.
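For spatially calibrated cameras, the simplest form of multi-view geometric fusion is triangulation: each view's 3x4 projection matrix contributes two linear constraints on the 3D point. The standard DLT triangulator below is given as background, not as the paper's learned pipeline; the cameras are synthetic:

```python
import numpy as np

def triangulate(projections, points2d):
    """Linear (DLT) triangulation of one 3D point from >= 2 calibrated views.
    projections: 3x4 camera matrices P = K [R | t]; points2d: (u, v) per view."""
    rows = []
    for P, (u, v) in zip(projections, points2d):
        rows.append(u * P[2] - P[0])    # each view adds two linear constraints
        rows.append(v * P[2] - P[1])
    _, _, Vt = np.linalg.svd(np.stack(rows))
    X = Vt[-1]                          # null vector = homogeneous 3D point
    return X[:3] / X[3]

def project(P, X):
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Two synthetic normalized cameras (K = I) separated by a 1 m baseline.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.2, 0.1, 4.0])
print(triangulate([P1, P2], [project(P1, X_true), project(P2, X_true)]))  # ~[0.2 0.1 4.]
```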
arXiv Detail & Related papers (2020-04-05T12:52:29Z)
- Redesigning SLAM for Arbitrary Multi-Camera Systems [51.81798192085111]
Adding more cameras to SLAM systems improves robustness and accuracy but complicates the design of the visual front-end significantly.
In this work, we aim for an adaptive SLAM system that works for arbitrary multi-camera setups.
We adapt a state-of-the-art visual-inertial odometry pipeline with these modifications, and experimental results show that the modified pipeline can adapt to a wide range of camera setups.
arXiv Detail & Related papers (2020-03-04T11:44:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.