LiDAR-BIND-T: Improved and Temporally Consistent Sensor Modality Translation and Fusion for Robotic Applications
- URL: http://arxiv.org/abs/2509.05728v3
- Date: Tue, 30 Sep 2025 13:10:11 GMT
- Title: LiDAR-BIND-T: Improved and Temporally Consistent Sensor Modality Translation and Fusion for Robotic Applications
- Authors: Niels Balemans, Ali Anwar, Jan Steckel, Siegfried Mercelis
- Abstract summary: This paper extends LiDAR-BIND, a modular multi-modal fusion framework that binds heterogeneous sensors (radar, sonar) to a LiDAR-defined latent space. We introduce three contributions: (i) temporal embedding similarity that aligns consecutive latent representations, (ii) a motion-aligned transformation loss that matches displacement between predictions and ground truth LiDAR, and (iii) windowed temporal fusion using a specialised temporal module.
- Score: 2.112132378217468
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper extends LiDAR-BIND, a modular multi-modal fusion framework that binds heterogeneous sensors (radar, sonar) to a LiDAR-defined latent space, with mechanisms that explicitly enforce temporal consistency. We introduce three contributions: (i) temporal embedding similarity that aligns consecutive latent representations, (ii) a motion-aligned transformation loss that matches displacement between predictions and ground truth LiDAR, and (iii) windowed temporal fusion using a specialised temporal module. We further update the model architecture to better preserve spatial structure. Evaluations on radar/sonar-to-LiDAR translation demonstrate improved temporal and spatial coherence, yielding lower absolute trajectory error and better occupancy map accuracy in Cartographer-based SLAM (Simultaneous Localisation and Mapping). We propose different metrics based on the Fréchet Video Motion Distance (FVMD) and a correlation-peak distance metric providing practical temporal quality indicators to evaluate SLAM performance. The proposed temporal LiDAR-BIND, or LiDAR-BIND-T, maintains modular modality fusion while substantially enhancing temporal stability, resulting in improved robustness and performance for downstream SLAM.
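The abstract names a temporal embedding similarity term but does not give its formula. As a rough illustration only, the sketch below assumes one plausible reading: the cosine similarity between consecutive latent embeddings of a translated modality should match that of the LiDAR reference embeddings. The function names, the squared-gap penalty, and the plain-list embedding format are all hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def temporal_similarity_loss(pred_embeddings, ref_embeddings):
    """Hypothetical temporal-consistency penalty (an assumption, not the paper's loss).

    For each pair of consecutive frames, compare the frame-to-frame cosine
    similarity of the predicted embeddings with that of the reference
    (e.g. LiDAR) embeddings, and average the squared gaps.
    """
    if len(pred_embeddings) != len(ref_embeddings):
        raise ValueError("sequences must have the same number of frames")
    if len(pred_embeddings) < 2:
        raise ValueError("need at least two frames")
    gaps = []
    for t in range(1, len(pred_embeddings)):
        s_pred = cosine_similarity(pred_embeddings[t - 1], pred_embeddings[t])
        s_ref = cosine_similarity(ref_embeddings[t - 1], ref_embeddings[t])
        gaps.append((s_pred - s_ref) ** 2)
    return sum(gaps) / len(gaps)


if __name__ == "__main__":
    # Toy 2-D embeddings: the reference is static, the prediction rotates by 90 degrees.
    reference = [[1.0, 0.0], [1.0, 0.0]]
    prediction = [[1.0, 0.0], [0.0, 1.0]]
    print(temporal_similarity_loss(prediction, reference))
```

Under this reading, a prediction whose embeddings evolve over time exactly as the reference embeddings do incurs zero penalty, while frame-to-frame jitter that the reference does not exhibit is penalised.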
Related papers
- DVLO4D: Deep Visual-Lidar Odometry with Sparse Spatial-temporal Fusion [28.146811420532455]
We introduce DVLO4D, a novel visual-LiDAR odometry framework that leverages sparse spatial-temporal fusion to enhance accuracy and robustness. Our method is efficient, with an inference time of 82 ms, giving it potential for real-time deployment.
arXiv Detail & Related papers (2025-09-07T11:43:11Z) - Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations [23.21118045286231]
LiMA is a novel framework that captures longer range temporal correlations to enhance LiDAR representation learning. LiMA has high pretraining efficiency and incurs no additional computational overhead during downstream tasks. Experiments on mainstream LiDAR-based perception benchmarks demonstrate that LiMA significantly improves both LiDAR semantic segmentation and 3D object detection.
arXiv Detail & Related papers (2025-07-07T17:59:58Z) - SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining [62.433137130087445]
SuperFlow++ is a novel framework that integrates pretraining and downstream tasks using consecutive camera pairs. We show that SuperFlow++ outperforms state-of-the-art methods across diverse tasks and driving conditions. With strong generalizability and computational efficiency, SuperFlow++ establishes a new benchmark for data-efficient LiDAR-based perception in autonomous driving.
arXiv Detail & Related papers (2025-03-25T17:59:57Z) - Semantic-Supervised Spatial-Temporal Fusion for LiDAR-based 3D Object Detection [22.890432295751086]
LiDAR-based 3D object detection presents significant challenges due to the inherent sparsity of LiDAR points. We propose a novel fusion module to relieve the spatial misalignment caused by object motion over time. We also propose a Semantic Injection method to enrich the sparse LiDAR data by injecting point-wise semantic labels.
arXiv Detail & Related papers (2025-03-13T17:30:20Z) - Cross Space and Time: A Spatio-Temporal Unitized Model for Traffic Flow Forecasting [16.782154479264126]
Predicting spatio-temporal traffic flow presents challenges due to complex interactions between spatial and temporal factors.
Existing approaches address these dimensions in isolation, neglecting their critical interdependencies.
In this paper, we introduce the Adaptive Spatio-Temporal Unitized Cell (ASTUC), a unified framework designed to capture both spatial and temporal dependencies.
arXiv Detail & Related papers (2024-11-14T07:34:31Z) - LiDAR-GS:Real-time LiDAR Re-Simulation using Gaussian Splatting [50.808933338389686]
We present LiDAR-GS, a real-time, high-fidelity re-simulation of LiDAR scans in public urban road scenes. The method achieves state-of-the-art results in both rendering frame rate and quality on publicly available large scene datasets.
arXiv Detail & Related papers (2024-10-07T15:07:56Z) - Future Does Matter: Boosting 3D Object Detection with Temporal Motion Estimation in Point Cloud Sequences [25.74000325019015]
We introduce a novel LiDAR 3D object detection framework, namely LiSTM, to facilitate spatial-temporal feature learning with cross-frame motion forecasting information.
We have conducted experiments on the Waymo and nuScenes datasets to demonstrate that the proposed framework achieves superior 3D detection performance.
arXiv Detail & Related papers (2024-09-06T16:29:04Z) - DSLO: Deep Sequence LiDAR Odometry Based on Inconsistent Spatio-temporal Propagation [66.8732965660931]
This paper introduces DSLO, a 3D point cloud sequence learning model for LiDAR odometry based on inconsistent spatio-temporal propagation.
It consists of a pyramid structure with a sequential pose module, a hierarchical pose refinement module, and a temporal feature propagation module.
arXiv Detail & Related papers (2024-09-01T15:12:48Z) - Local-Global Temporal Difference Learning for Satellite Video Super-Resolution [53.03380679343968]
We propose to exploit the well-defined temporal difference for efficient and effective temporal compensation. To fully utilize the local and global temporal information within frames, we systematically modeled the short-term and long-term temporal discrepancies. Rigorous objective and subjective evaluations conducted across five mainstream video satellites demonstrate that our method performs favorably against state-of-the-art approaches.
arXiv Detail & Related papers (2023-04-10T07:04:40Z) - Ret3D: Rethinking Object Relations for Efficient 3D Object Detection in Driving Scenes [82.4186966781934]
We introduce a simple, efficient, and effective two-stage detector, termed Ret3D.
At the core of Ret3D is the use of novel intra-frame and inter-frame relation modules.
With negligible extra overhead, Ret3D achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-08-18T03:48:58Z) - Robust Self-Supervised LiDAR Odometry via Representative Structure Discovery and 3D Inherent Error Modeling [67.75095378830694]
We develop a two-stage odometry estimation network, where we obtain the ego-motion by estimating a set of sub-region transformations.
In this paper, we aim to alleviate the influence of unreliable structures in training, inference and mapping phases.
Our two-frame odometry outperforms the previous state-of-the-art methods by 16%/12% in terms of translational/rotational errors.
arXiv Detail & Related papers (2022-02-27T12:52:27Z) - Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition [62.46544616232238]
Previous motion recognition methods have achieved promising performance through tightly coupled multi-modal spatiotemporal representations.
We propose to decouple and recouple the spatiotemporal representation for RGB-D-based motion recognition.
arXiv Detail & Related papers (2021-12-16T18:59:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.