Related papers: Multimodal Signal Processing For Thermo-Visible-Lidar Fusion In Real-time 3D Semantic Mapping

URL: http://arxiv.org/abs/2601.09578v1
Date: Wed, 14 Jan 2026 15:46:57 GMT
Title: Multimodal Signal Processing For Thermo-Visible-Lidar Fusion In Real-time 3D Semantic Mapping
Authors: Jiajun Sun, Yangyi Ou, Haoyuan Zheng, Chao Yang, Yue Ma
Abstract summary: This paper presents a novel method for semantically enhancing 3D point cloud maps with thermal information. The system projects real-time LiDAR point clouds onto this fused image stream. It then segments heat source features in the thermal channel to instantly identify high temperature targets and applies this temperature information as a semantic layer on the final 3D map.
Score: 8.401699100150866
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In complex environments, autonomous robot navigation and environmental perception pose higher requirements for SLAM technology. This paper presents a novel method for semantically enhancing 3D point cloud maps with thermal information. By first performing pixel-level fusion of visible and infrared images, the system projects real-time LiDAR point clouds onto this fused image stream. It then segments heat source features in the thermal channel to instantly identify high temperature targets and applies this temperature information as a semantic layer on the final 3D map. This approach generates maps that not only have accurate geometry but also possess a critical semantic understanding of the environment, making it highly valuable for specific applications like rapid disaster assessment and industrial preventive maintenance.
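The projection and labeling steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a pinhole camera model, a known LiDAR-to-camera rigid transform, a thermal image already expressed in temperature units and registered to the camera intrinsics, and a fixed temperature threshold standing in for the paper's heat source segmentation. The function name `label_hot_points` and all parameter names are hypothetical.

```python
import numpy as np

def label_hot_points(points_lidar, thermal_img, K, T_cam_lidar, hot_threshold):
    """Project LiDAR points into a thermal image and flag hot points.

    points_lidar  : (N, 3) points in the LiDAR frame
    thermal_img   : (H, W) per-pixel temperature (assumed registered to K)
    K             : (3, 3) pinhole intrinsics
    T_cam_lidar   : (4, 4) rigid transform from LiDAR frame to camera frame
    hot_threshold : temperature above which a point is labeled hot
    Returns (temps, is_hot); temps is NaN for points that miss the image.
    """
    n = points_lidar.shape[0]
    # Move points into the camera frame using homogeneous coordinates.
    homo = np.hstack([points_lidar, np.ones((n, 1))])
    cam = (T_cam_lidar @ homo.T).T[:, :3]

    temps = np.full(n, np.nan)
    is_hot = np.zeros(n, dtype=bool)

    # Keep only points in front of the camera.
    in_front = cam[:, 2] > 1e-6
    uvw = (K @ cam[in_front].T).T
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)

    # Keep only projections that land inside the image bounds.
    h, w = thermal_img.shape
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    idx = np.flatnonzero(in_front)[inside]

    # Sample the temperature at each projected pixel and apply the threshold.
    sampled = thermal_img[v[inside], u[inside]]
    temps[idx] = sampled
    is_hot[idx] = sampled > hot_threshold
    return temps, is_hot
```

The returned per-point temperature and boolean flag play the role of the semantic layer attached to the 3D map. A real system would also need occlusion handling and time synchronization between sensors, which this sketch omits.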

Related papers

Fast 3D Surrogate Modeling for Data Center Thermal Management [15.644716872105002]
Traditional thermal CFD solvers are computationally expensive and require expert-crafted meshes and boundary conditions. We develop a vision-based surrogate modeling framework that operates directly on a 3D voxelized representation of the data center. Our results show that the surrogate models generalize across data center configurations and achieve up to 20,000x speedup.
arXiv Detail & Related papers (2025-11-13T02:12:24Z)
ThermalDiffusion: Visual-to-Thermal Image-to-Image Translation for Autonomous Navigation [6.524847658755803]
We propose a solution to augment multi-modal datasets with synthetic thermal data to enable widespread and rapid adaptation of thermal cameras. We explore the use of conditional diffusion models to convert existing RGB images to thermal images using self-attention to learn the thermal properties of real-world objects.
arXiv Detail & Related papers (2025-06-26T03:18:22Z)
ThermoStereoRT: Thermal Stereo Matching in Real Time via Knowledge Distillation and Attention-based Refinement [9.923805440410739]
We introduce ThermoStereoRT, a real-time thermal stereo matching method. It recovers disparity from two rectified thermal stereo images. We envision applications such as night-time drone surveillance or under-bed cleaning robots.
arXiv Detail & Related papers (2025-04-10T03:24:21Z)
Thermal3D-GS: Physics-induced 3D Gaussians for Thermal Infrared Novel-view Synthesis [11.793425521298488]
This paper introduces a physics-induced 3D Gaussian splatting method named Thermal3D-GS. The first large-scale benchmark dataset for this field named Thermal Infrared Novel-view Synthesis dataset (TI-NSD) is created. The results indicate that our method outperforms the baseline method with a 3.03 dB improvement in PSNR.
arXiv Detail & Related papers (2024-09-12T13:46:53Z)
SeMLaPS: Real-time Semantic Mapping with Latent Prior Networks and Quasi-Planar Segmentation [53.83313235792596]
We present a new methodology for real-time semantic mapping from RGB-D sequences. It combines a 2D neural network and a 3D network based on a SLAM system with 3D occupancy mapping. Our system achieves state-of-the-art semantic mapping quality within 2D-3D networks-based systems.
arXiv Detail & Related papers (2023-06-28T22:36:44Z)
Spatiotemporally Consistent HDR Indoor Lighting Estimation [66.26786775252592]
We propose a physically-motivated deep learning framework to solve the indoor lighting estimation problem. Given a single LDR image with a depth map, our method predicts spatially consistent lighting at any given image position. Our framework achieves photorealistic lighting prediction with higher quality compared to state-of-the-art single-image or video-based methods.
arXiv Detail & Related papers (2023-05-07T20:36:29Z)
DETR4D: Direct Multi-View 3D Object Detection with Sparse Attention [50.11672196146829]
3D object detection with surround-view images is an essential task for autonomous driving. We propose DETR4D, a Transformer-based framework that explores sparse attention and direct feature query for 3D object detection in multi-view images.
arXiv Detail & Related papers (2022-12-15T14:18:47Z)
Does Thermal Really Always Matter for RGB-T Salient Object Detection? [153.17156598262656]
This paper proposes a network named TNet to solve the RGB-T salient object detection (SOD) task. In this paper, we introduce a global illumination estimation module to predict the global illuminance score of the image. On the other hand, we introduce a two-stage localization and complementation module in the decoding phase to transfer object localization cue and internal integrity cue in thermal features to the RGB modality.
arXiv Detail & Related papers (2022-10-09T13:50:12Z)
Real-Time Multi-Modal Semantic Fusion on Unmanned Aerial Vehicles [28.504921333436837]
We propose a UAV system for real-time semantic inference and fusion of multiple sensor modalities. Semantic segmentation of LiDAR scans and RGB images, as well as object detection on RGB and thermal images, run online onboard the UAV computer. We evaluate the integrated system in real-world experiments in an urban environment.
arXiv Detail & Related papers (2021-08-14T20:16:08Z)
Cross-Modality 3D Object Detection [63.29935886648709]
We present a novel two-stage multi-modal fusion network for 3D object detection. The whole architecture facilitates two-stage fusion. Our experiments on the KITTI dataset show that the proposed multi-stage fusion helps the network to learn better representations.
arXiv Detail & Related papers (2020-08-16T11:01:20Z)
Exploring Thermal Images for Object Detection in Underexposure Regions for Autonomous Driving [67.69430435482127]
Underexposure regions are vital to construct a complete perception of the surroundings for safe autonomous driving. The availability of thermal cameras has provided an essential alternate to explore regions where other optical sensors lack in capturing interpretable signals. This work proposes a domain adaptation framework which employs a style transfer technique for transfer learning from visible spectrum images to thermal images.
arXiv Detail & Related papers (2020-06-01T09:59:09Z)

This list is automatically generated from the titles and abstracts of the papers on this site.