HS-SLAM: Hybrid Representation with Structural Supervision for Improved Dense SLAM
- URL: http://arxiv.org/abs/2503.21778v1
- Date: Thu, 27 Mar 2025 17:59:54 GMT
- Title: HS-SLAM: Hybrid Representation with Structural Supervision for Improved Dense SLAM
- Authors: Ziren Gong, Fabio Tosi, Youmin Zhang, Stefano Mattoccia, Matteo Poggi
- Abstract summary: NeRF-based SLAM has recently achieved promising results in tracking and reconstruction. We present HS-SLAM to tackle these problems. We propose a hybrid encoding network that combines the complementary strengths of hash-grid, tri-planes, and one-blob.
- Score: 38.82194947459594
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: NeRF-based SLAM has recently achieved promising results in tracking and reconstruction. However, existing methods face challenges in providing sufficient scene representation, capturing structural information, and maintaining global consistency in scenes that exhibit significant movement or are subject to forgetting. To this end, we present HS-SLAM to tackle these problems. To enhance scene representation capacity, we propose a hybrid encoding network that combines the complementary strengths of hash-grid, tri-planes, and one-blob, improving the completeness and smoothness of reconstruction. Additionally, we introduce structural supervision by sampling patches of non-local pixels rather than individual rays to better capture the scene structure. To ensure global consistency, we implement an active global bundle adjustment (BA) to eliminate camera drifts and mitigate accumulative errors. Experimental results demonstrate that HS-SLAM outperforms the baselines in tracking and reconstruction accuracy while maintaining the efficiency required for robotics.
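As a rough illustration of one of the three encodings named in the abstract, the sketch below implements a one-blob encoding in plain Python: a scalar coordinate in [0, 1] is turned into a Gaussian bump sampled at evenly spaced bin centers. The function names `one_blob` and `encode_point`, the bin count, and the choice of kernel width are assumptions made for this example and are not taken from HS-SLAM. The hash-grid and tri-plane branches, and the way the paper fuses the three, are not shown.

```python
import math


def one_blob(s, num_bins=16):
    """Encode a scalar s in [0, 1] as a Gaussian bump sampled at bin centers.

    Assumption: the kernel width equals the bin width (sigma = 1 / num_bins)
    and the kernel is evaluated at bin centers rather than integrated per bin.
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must lie in [0, 1]")
    sigma = 1.0 / num_bins
    centers = [(i + 0.5) / num_bins for i in range(num_bins)]
    return [math.exp(-0.5 * ((s - c) / sigma) ** 2) for c in centers]


def encode_point(xyz, num_bins=16):
    """Concatenate the per-axis one-blob codes of a normalized 3D point."""
    return [v for coord in xyz for v in one_blob(coord, num_bins)]


if __name__ == "__main__":
    features = encode_point((0.1, 0.5, 0.9), num_bins=8)
    print(len(features))  # 3 axes times 8 bins
```

Compared with a one-hot bin index, the smooth bump changes gradually as the coordinate moves, which is one plausible reason such an encoding would favor smooth reconstructions.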
Related papers
- GPSMamba: A Global Phase and Spectral Prompt-guided Mamba for Infrared Image Super-Resolution [4.063682271487617]
Infrared Image Super-Resolution is challenged by the low contrast and sparse textures of infrared data.
GPSMamba is a framework that synergizes architectural guidance with non-causal supervision.
arXiv Detail & Related papers (2025-07-25T06:56:16Z)
- R3GS: Gaussian Splatting for Robust Reconstruction and Relocalization in Unconstrained Image Collections [9.633163304379861]
R3GS is a robust reconstruction and relocalization framework tailored for unconstrained datasets.
To mitigate the adverse effects of transient objects on the reconstruction process, we fine-tune a lightweight human detection network.
To address the challenges posed by sky regions in outdoor scenes, we propose an effective sky-handling technique that incorporates a depth prior as a constraint.
arXiv Detail & Related papers (2025-05-21T09:25:22Z)
- Event-Enhanced Blurry Video Super-Resolution [52.894824081586776]
We tackle the task of blurry video super-resolution (BVSR), aiming to generate high-resolution (HR) videos from low-resolution (LR) and blurry inputs.
Current BVSR methods often fail to restore sharp details at high resolutions, resulting in noticeable artifacts and jitter.
We introduce event signals into BVSR and propose a novel event-enhanced network, Ev-DeVSR.
arXiv Detail & Related papers (2025-04-17T15:55:41Z)
- Deblur Gaussian Splatting SLAM [57.35366732452066]
Deblur-SLAM is a robust RGB SLAM pipeline designed to recover sharp reconstructions from motion-blurred inputs.
We model the physical image formation process of motion-blurred images and minimize the error between the observed blurry images and rendered blurry images.
We achieve state-of-the-art results for sharp map estimation and sub-frame trajectory recovery both on synthetic and real-world blurry input data.
arXiv Detail & Related papers (2025-03-16T16:59:51Z)
- Evolving High-Quality Rendering and Reconstruction in a Unified Framework with Contribution-Adaptive Regularization [27.509109317973817]
3D Gaussian Splatting (3DGS) has garnered significant attention for its high-quality rendering and fast inference speed.
Previous methods primarily focus on geometry regularization, with common approaches including primitive-based and dual-model frameworks.
We propose CarGS, a unified model leveraging contribution-adaptive regularization to achieve simultaneous, high-quality rendering and surface reconstruction.
arXiv Detail & Related papers (2025-03-02T12:51:38Z)
- Unleashing Correlation and Continuity for Hyperspectral Reconstruction from RGB Images [64.80875911446937]
We propose a Correlation and Continuity Network (CCNet) for HSI reconstruction from RGB images.
For the correlation of local spectrum, we introduce the Group-wise Spectral Correlation Modeling (GrSCM) module.
For the continuity of global spectrum, we design the Neighborhood-wise Spectral Continuity Modeling (NeSCM) module.
arXiv Detail & Related papers (2025-01-02T15:14:40Z)
- SMORE: Simultaneous Map and Object REconstruction [66.66729715211642]
We present a method for dynamic surface reconstruction of large-scale urban scenes from LiDAR.
We take a holistic perspective and optimize a compositional model of a dynamic scene that decomposes the world into rigidly-moving objects and the background.
arXiv Detail & Related papers (2024-06-19T23:53:31Z)
- MG-SLAM: Structure Gaussian Splatting SLAM with Manhattan World Hypothesis [11.324845082176559]
We present Manhattan Gaussian SLAM, an RGB-D system that enhances geometric accuracy and completeness.
By seamlessly integrating fused line segments derived from structured scenes, our method ensures robust tracking in textureless indoor areas.
Experiments conducted on both synthetic and real-world scenes demonstrate that these advancements enable our method to achieve state-of-the-art performance.
arXiv Detail & Related papers (2024-05-30T13:16:17Z)
- Low-Light Video Enhancement via Spatial-Temporal Consistent Illumination and Reflection Decomposition [68.6707284662443]
Low-Light Video Enhancement (LLVE) seeks to restore dynamic and static scenes plagued by severe invisibility and noise.
One critical aspect is formulating a consistency constraint specifically for temporal-spatial illumination and appearance enhanced versions.
We present an innovative video Retinex-based decomposition strategy that operates without the need for explicit supervision.
arXiv Detail & Related papers (2024-05-24T15:56:40Z)
- SGS-SLAM: Semantic Gaussian Splatting For Neural Dense SLAM [5.144010652281121]
We present SGS-SLAM, the first semantic visual SLAM system based on Gaussian Splatting.
It incorporates appearance, geometry, and semantic features through multi-channel optimization, addressing the oversmoothing limitations of neural implicit SLAM systems.
It delivers state-of-the-art performance in camera pose estimation, map reconstruction, precise semantic segmentation, and object-level geometric accuracy.
arXiv Detail & Related papers (2024-02-05T18:03:53Z)
- GO-SLAM: Global Optimization for Consistent 3D Instant Reconstruction [45.49960166785063]
GO-SLAM is a deep-learning-based dense visual SLAM framework globally optimizing poses and 3D reconstruction in real-time.
Results on various synthetic and real-world datasets demonstrate that GO-SLAM outperforms state-of-the-art approaches at tracking robustness and reconstruction accuracy.
arXiv Detail & Related papers (2023-09-05T17:59:58Z)
- Co-SLAM: Joint Coordinate and Sparse Parametric Encodings for Neural Real-Time SLAM [14.56883275492083]
Co-SLAM is an RGB-D SLAM system based on a hybrid representation.
It performs robust camera tracking and high-fidelity surface reconstruction in real time.
arXiv Detail & Related papers (2023-04-27T17:46:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.