HS-SLAM: Hybrid Representation with Structural Supervision for Improved Dense SLAM
- URL: http://arxiv.org/abs/2503.21778v1
- Date: Thu, 27 Mar 2025 17:59:54 GMT
- Title: HS-SLAM: Hybrid Representation with Structural Supervision for Improved Dense SLAM
- Authors: Ziren Gong, Fabio Tosi, Youmin Zhang, Stefano Mattoccia, Matteo Poggi
- Abstract summary: NeRF-based SLAM has recently achieved promising results in tracking and reconstruction. We present HS-SLAM to tackle these problems. We propose a hybrid encoding network that combines the complementary strengths of hash-grid, tri-planes, and one-blob.
- Score: 38.82194947459594
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: NeRF-based SLAM has recently achieved promising results in tracking and reconstruction. However, existing methods face challenges in providing sufficient scene representation, capturing structural information, and maintaining global consistency in scenes that exhibit significant movement or are forgotten. To this end, we present HS-SLAM to tackle these problems. To enhance scene representation capacity, we propose a hybrid encoding network that combines the complementary strengths of hash-grid, tri-planes, and one-blob, improving the completeness and smoothness of reconstruction. Additionally, we introduce structural supervision by sampling patches of non-local pixels rather than individual rays to better capture the scene structure. To ensure global consistency, we implement an active global bundle adjustment (BA) to eliminate camera drift and mitigate accumulated errors. Experimental results demonstrate that HS-SLAM outperforms the baselines in tracking and reconstruction accuracy while maintaining the efficiency required for robotics.
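Two of the ideas named in the abstract, one-blob encoding and sampling patches of non-local pixels, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the bin count, the choice of sigma = 1 / n_bins, and the strided-grid sampling scheme are all assumptions made for illustration; the paper's actual encoding and sampling details may differ.

```python
import math
import random


def one_blob(x, n_bins=16):
    """Encode a scalar x in [0, 1] as n_bins Gaussian activations.

    Assumption: the kernel is evaluated at the bin centres with
    sigma = 1 / n_bins, a common simplification of one-blob encoding.
    """
    sigma = 1.0 / n_bins
    centres = [(i + 0.5) / n_bins for i in range(n_bins)]
    return [math.exp(-0.5 * ((x - c) / sigma) ** 2) for c in centres]


def sample_nonlocal_patch(height, width, patch=4, stride=8, rng=random):
    """Return (row, col) coordinates of a patch x patch grid of pixels.

    Neighbouring samples are `stride` pixels apart, so the patch covers
    a wide image region instead of adjacent pixels.
    Hypothetical scheme for illustration only.
    """
    span = (patch - 1) * stride
    if span >= height or span >= width:
        raise ValueError("patch does not fit in the image")
    top = rng.randrange(height - span)
    left = rng.randrange(width - span)
    return [(top + i * stride, left + j * stride)
            for i in range(patch) for j in range(patch)]


if __name__ == "__main__":
    print(one_blob(0.5, n_bins=4))
    print(sample_nonlocal_patch(480, 640, rng=random.Random(0)))
```

In a hybrid encoder of the kind the abstract describes, a smooth encoding such as this would typically be concatenated with grid-based features before a small decoder; how HS-SLAM combines them is not specified in this summary.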
Related papers
- GSM-GS: Geometry-Constrained Single and Multi-view Gaussian Splatting for Surface Reconstruction [16.96307929629197]
The unstructured and irregular nature of Gaussian point clouds poses challenges to reconstruction accuracy.
We propose GSM-GS: a synergistic optimization framework integrating single-view adaptive sub-region weighting constraints and multi-view spatial structure refinement.
Our method achieves both competitive rendering quality and geometric reconstruction.
arXiv Detail & Related papers (2026-02-13T10:26:32Z)
- SING3R-SLAM: Submap-based Indoor Monocular Gaussian SLAM with 3D Reconstruction Priors [80.51557267896938]
SING3R-SLAM is a globally consistent and compact Gaussian-based dense RGB SLAM framework.
We show that SING3R-SLAM achieves state-of-the-art tracking, 3D reconstruction, and novel view rendering, resulting in over 12% improvement in tracking and producing finer, more detailed geometry.
arXiv Detail & Related papers (2025-11-21T12:40:55Z)
- RoGER-SLAM: A Robust Gaussian Splatting SLAM System for Noisy and Low-light Environment Resilience [4.942278642834429]
RoGER-SLAM is a robust 3DGS SLAM system tailored for noise and low-light resilience.
We show that RoGER-SLAM consistently improves trajectory accuracy and reconstruction quality compared with other 3DGS-SLAM systems.
arXiv Detail & Related papers (2025-10-26T09:32:43Z)
- Flow-Matching Guided Deep Unfolding for Hyperspectral Image Reconstruction [53.26903617819014]
Flow-Matching-guided Unfolding network (FMU) is the first to integrate flow matching into HSI reconstruction.
To further strengthen the learned dynamics, we introduce a mean velocity loss.
Experiments on both simulated and real datasets show that FMU significantly outperforms existing approaches in reconstruction quality.
arXiv Detail & Related papers (2025-10-02T11:32:00Z)
- Progressive Flow-inspired Unfolding for Spectral Compressive Imaging [11.638690628451647]
Coded aperture snapshot spectral imaging (CASSI) retrieves a 3D hyperspectral image (HSI) from a single 2D compressed measurement.
Recent deep unfolding networks (DUNs) have achieved the state of the art in CASSI reconstruction.
Inspired by diffusion trajectories and flow matching, we propose a novel trajectory-controllable unfolding framework.
arXiv Detail & Related papers (2025-09-15T16:10:50Z)
- GPSMamba: A Global Phase and Spectral Prompt-guided Mamba for Infrared Image Super-Resolution [4.063682271487617]
Infrared Image Super-Resolution is challenged by the low contrast and sparse textures of infrared data.
GPSMamba is a framework that synergizes architectural guidance with non-causal supervision.
arXiv Detail & Related papers (2025-07-25T06:56:16Z)
- R3GS: Gaussian Splatting for Robust Reconstruction and Relocalization in Unconstrained Image Collections [9.633163304379861]
R3GS is a robust reconstruction and relocalization framework tailored for unconstrained datasets.
To mitigate the adverse effects of transient objects on the reconstruction process, we fine-tune a lightweight human detection network.
To address the challenges posed by sky regions in outdoor scenes, we propose an effective sky-handling technique that incorporates a depth prior as a constraint.
arXiv Detail & Related papers (2025-05-21T09:25:22Z)
- VRS-UIE: Value-Driven Reordering Scanning for Underwater Image Enhancement [104.78586859995333]
State Space Models (SSMs) have emerged as a promising backbone for vision tasks due to their linear complexity and global receptive field.
The predominance of large-portion, homogeneous but useless oceanic backgrounds can dilute the feature representation responses of sparse yet valuable targets.
We propose a novel Value-Driven Reordering Scanning framework for Underwater Image Enhancement (UIE).
Our framework sets a new state-of-the-art, delivering superior enhancement performance (surpassing WMamba by 0.89 dB on average) by effectively suppressing water bias and preserving structural and color fidelity.
arXiv Detail & Related papers (2025-05-02T12:21:44Z)
- Event-Enhanced Blurry Video Super-Resolution [52.894824081586776]
We tackle the task of blurry video super-resolution (BVSR), aiming to generate high-resolution (HR) videos from low-resolution (LR) and blurry inputs.
Current BVSR methods often fail to restore sharp details at high resolutions, resulting in noticeable artifacts and jitter.
We introduce event signals into BVSR and propose a novel event-enhanced network, Ev-DeVSR.
arXiv Detail & Related papers (2025-04-17T15:55:41Z)
- Deblur Gaussian Splatting SLAM [57.35366732452066]
Deblur-SLAM is a robust RGB SLAM pipeline designed to recover sharp reconstructions from motion-blurred inputs.
We model the physical image formation process of motion-blurred images and minimize the error between the observed blurry images and rendered blurry images.
We achieve state-of-the-art results for sharp map estimation and sub-frame trajectory recovery both on synthetic and real-world blurry input data.
arXiv Detail & Related papers (2025-03-16T16:59:51Z)
- Evolving High-Quality Rendering and Reconstruction in a Unified Framework with Contribution-Adaptive Regularization [27.509109317973817]
3D Gaussian Splatting (3DGS) has garnered significant attention for its high-quality rendering and fast inference speed.
Previous methods primarily focus on geometry regularization, with common approaches including primitive-based and dual-model frameworks.
We propose CarGS, a unified model leveraging contribution-adaptive regularization to achieve simultaneous, high-quality surface reconstruction.
arXiv Detail & Related papers (2025-03-02T12:51:38Z)
- Unleashing Correlation and Continuity for Hyperspectral Reconstruction from RGB Images [64.80875911446937]
We propose a Correlation and Continuity Network (CCNet) for HSI reconstruction from RGB images.
For the correlation of local spectrum, we introduce the Group-wise Spectral Correlation Modeling (GrSCM) module.
For the continuity of global spectrum, we design the Neighborhood-wise Spectral Continuity Modeling (NeSCM) module.
arXiv Detail & Related papers (2025-01-02T15:14:40Z)
- SMORE: Simultaneous Map and Object REconstruction [66.66729715211642]
We present a method for dynamic surface reconstruction of large-scale urban scenes from LiDAR.
We take a holistic perspective and optimize a compositional model of a dynamic scene that decomposes the world into rigidly-moving objects and the background.
arXiv Detail & Related papers (2024-06-19T23:53:31Z)
- MG-SLAM: Structure Gaussian Splatting SLAM with Manhattan World Hypothesis [11.324845082176559]
We present Manhattan Gaussian SLAM, an RGB-D system that enhances geometric accuracy and completeness.
By seamlessly integrating fused line segments derived from structured scenes, our method ensures robust tracking in textureless indoor areas.
Experiments conducted on both synthetic and real-world scenes demonstrate that these advancements enable our method to achieve state-of-the-art performance.
arXiv Detail & Related papers (2024-05-30T13:16:17Z)
- Low-Light Video Enhancement via Spatial-Temporal Consistent Illumination and Reflection Decomposition [68.6707284662443]
Low-Light Video Enhancement (LLVE) seeks to restore dynamic and static scenes plagued by severe invisibility and noise.
One critical aspect is formulating a consistency constraint specifically for temporal-spatial illumination and appearance enhanced versions.
We present an innovative video Retinex-based decomposition strategy that operates without the need for explicit supervision.
arXiv Detail & Related papers (2024-05-24T15:56:40Z)
- SGS-SLAM: Semantic Gaussian Splatting For Neural Dense SLAM [5.144010652281121]
We present SGS-SLAM, the first semantic visual SLAM system based on Gaussian Splatting.
It incorporates appearance, geometry, and semantic features through multi-channel optimization, addressing the oversmoothing limitations of neural implicit SLAM systems.
It delivers state-of-the-art performance in camera pose estimation, map reconstruction, precise semantic segmentation, and object-level geometric accuracy.
arXiv Detail & Related papers (2024-02-05T18:03:53Z)
- GO-SLAM: Global Optimization for Consistent 3D Instant Reconstruction [45.49960166785063]
GO-SLAM is a deep-learning-based dense visual SLAM framework globally optimizing poses and 3D reconstruction in real-time.
Results on various synthetic and real-world datasets demonstrate that GO-SLAM outperforms state-of-the-art approaches at tracking robustness and reconstruction accuracy.
arXiv Detail & Related papers (2023-09-05T17:59:58Z)
- Co-SLAM: Joint Coordinate and Sparse Parametric Encodings for Neural Real-Time SLAM [14.56883275492083]
Co-SLAM is an RGB-D SLAM system based on a hybrid representation.
It performs robust camera tracking and high-fidelity surface reconstruction in real time.
arXiv Detail & Related papers (2023-04-27T17:46:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.