Related papers: LiDAR-BIND-T: Improved and Temporally Consistent Sensor Modality Translation and Fusion for Robotic Applications

LiDAR-BIND-T: Improved and Temporally Consistent Sensor Modality Translation and Fusion for Robotic Applications

URL: http://arxiv.org/abs/2509.05728v3
Date: Tue, 30 Sep 2025 13:10:11 GMT
Title: LiDAR-BIND-T: Improved and Temporally Consistent Sensor Modality Translation and Fusion for Robotic Applications
Authors: Niels Balemans, Ali Anwar, Jan Steckel, Siegfried Mercelis,
Abstract summary: This paper extends LiDAR-BIND, a modular multi-modal fusion framework that binds heterogeneous sensors (radar, sonar) to a LiDAR-defined latent space.<n>We introduce three contributions: (i) temporal embedding similarity that aligns consecutive latent representations, (ii) a motion-aligned transformation loss that matches displacement between predictions and ground truth LiDAR, and (iii) windowed temporal fusion using a specialised temporal module.
Score: 2.112132378217468
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper extends LiDAR-BIND, a modular multi-modal fusion framework that binds heterogeneous sensors (radar, sonar) to a LiDAR-defined latent space, with mechanisms that explicitly enforce temporal consistency. We introduce three contributions: (i) temporal embedding similarity that aligns consecutive latent representations, (ii) a motion-aligned transformation loss that matches displacement between predictions and ground truth LiDAR, and (iii) windowed temporal fusion using a specialised temporal module. We further update the model architecture to better preserve spatial structure. Evaluations on radar/sonar-to-LiDAR translation demonstrate improved temporal and spatial coherence, yielding lower absolute trajectory error and better occupancy map accuracy in Cartographer-based SLAM (Simultaneous Localisation and Mapping). We propose different metrics based on the Fr\'echet Video Motion Distance (FVMD) and a correlation-peak distance metric providing practical temporal quality indicators to evaluate SLAM performance. The proposed temporal LiDAR-BIND, or LiDAR-BIND-T, maintains modular modality fusion while substantially enhancing temporal stability, resulting in improved robustness and performance for downstream SLAM.

Related papers

DVLO4D: Deep Visual-Lidar Odometry with Sparse Spatial-temporal Fusion [28.146811420532455]
We introduce DVLO4D, a novel visual-LiDAR odometry framework that leverages sparse spatial-temporal fusion to enhance accuracy and robustness.<n>Our method has high efficiency, with an inference time of 82 ms, possessing the potential for the real-time deployment.
arXiv Detail & Related papers (2025-09-07T11:43:11Z)
Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations [23.21118045286231]
LiMA is a novel framework that captures longer range temporal correlations to enhance LiDAR representation learning.<n>LiMA has high pretraining efficiency and incurs no additional computational overhead during downstream tasks.<n> experiments on mainstream LiDAR-based perception benchmarks demonstrate that LiMA significantly improves both LiDAR semantic segmentation and 3D object detection.
arXiv Detail & Related papers (2025-07-07T17:59:58Z)
SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining [62.433137130087445]
SuperFlow++ is a novel framework that integrates pretraining and downstream tasks using consecutive camera pairs.<n>We show that SuperFlow++ outperforms state-of-the-art methods across diverse tasks and driving conditions.<n>With strong generalizability and computational efficiency, SuperFlow++ establishes a new benchmark for data-efficient LiDAR-based perception in autonomous driving.
arXiv Detail & Related papers (2025-03-25T17:59:57Z)
Semantic-Supervised Spatial-Temporal Fusion for LiDAR-based 3D Object Detection [22.890432295751086]
LiDAR-based 3D object detection presents significant challenges due to the inherent sparsity of LiDAR points.<n>We propose a novel fusion module to relieve the spatial misalignment caused by the object motion over time.<n>We also propose a Semantic Injection method to enrich the sparse LiDAR data via injecting the point-wise semantic labels.
arXiv Detail & Related papers (2025-03-13T17:30:20Z)
Cross Space and Time: A Spatio-Temporal Unitized Model for Traffic Flow Forecasting [16.782154479264126]
Predicting backbone-temporal traffic flow presents challenges due to complex interactions between temporal factors. Existing approaches address these dimensions in isolation, neglecting their critical interdependencies. In this paper, we introduce Sanonymous-Temporal Unitized Unitized Cell (ASTUC), a unified framework designed to capture both spatial and temporal dependencies.
arXiv Detail & Related papers (2024-11-14T07:34:31Z)
LiDAR-GS:Real-time LiDAR Re-Simulation using Gaussian Splatting [50.808933338389686]
We present LiDAR-GS, a real-time, high-fidelity re-simulation of LiDAR scans in public urban road scenes.<n>The method achieves state-of-the-art results in both rendering frame rate and quality on publically available large scene datasets.
arXiv Detail & Related papers (2024-10-07T15:07:56Z)
Future Does Matter: Boosting 3D Object Detection with Temporal Motion Estimation in Point Cloud Sequences [25.74000325019015]
We introduce a novel LiDAR 3D object detection framework, namely LiSTM, to facilitate spatial-temporal feature learning with cross-frame motion forecasting information. We have conducted experiments on the aggregation and nuScenes datasets to demonstrate that the proposed framework achieves superior 3D detection performance.
arXiv Detail & Related papers (2024-09-06T16:29:04Z)
DSLO: Deep Sequence LiDAR Odometry Based on Inconsistent Spatio-temporal Propagation [66.8732965660931]
paper introduces a 3D point cloud sequence learning model based on inconsistent-temporal propagation for LiDAR odometry DSLO. It consists of a pyramid structure with a sequential pose module, a hierarchical pose refinement module, and a temporal feature propagation module.
arXiv Detail & Related papers (2024-09-01T15:12:48Z)
Local-Global Temporal Difference Learning for Satellite Video Super-Resolution [53.03380679343968]
We propose to exploit the well-defined temporal difference for efficient and effective temporal compensation.<n>To fully utilize the local and global temporal information within frames, we systematically modeled the short-term and long-term temporal discrepancies.<n> Rigorous objective and subjective evaluations conducted across five mainstream video satellites demonstrate that our method performs favorably against state-of-the-art approaches.
arXiv Detail & Related papers (2023-04-10T07:04:40Z)
Ret3D: Rethinking Object Relations for Efficient 3D Object Detection in Driving Scenes [82.4186966781934]
We introduce a simple, efficient, and effective two-stage detector, termed as Ret3D. At the core of Ret3D is the utilization of novel intra-frame and inter-frame relation modules. With negligible extra overhead, Ret3D achieves the state-of-the-art performance.
arXiv Detail & Related papers (2022-08-18T03:48:58Z)
Robust Self-Supervised LiDAR Odometry via Representative Structure Discovery and 3D Inherent Error Modeling [67.75095378830694]
We develop a two-stage odometry estimation network, where we obtain the ego-motion by estimating a set of sub-region transformations. In this paper, we aim to alleviate the influence of unreliable structures in training, inference and mapping phases. Our two-frame odometry outperforms the previous state of the arts by 16%/12% in terms of translational/rotational errors.
arXiv Detail & Related papers (2022-02-27T12:52:27Z)
Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition [62.46544616232238]
Previous motion recognition methods have achieved promising performance through the tightly coupled multi-temporal representation. We propose to decouple and recouple caused caused representation for RGB-D-based motion recognition.
arXiv Detail & Related papers (2021-12-16T18:59:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.