ThermalDiffusion: Visual-to-Thermal Image-to-Image Translation for Autonomous Navigation
- URL: http://arxiv.org/abs/2506.20969v1
- Date: Thu, 26 Jun 2025 03:18:22 GMT
- Title: ThermalDiffusion: Visual-to-Thermal Image-to-Image Translation for Autonomous Navigation
- Authors: Shruti Bansal, Wenshan Wang, Yifei Liu, Parv Maheshwari,
- Abstract summary: We propose a solution to augment multi-modal datasets with synthetic thermal data to enable widespread and rapid adaptation of thermal cameras.<n>We explore the use of conditional diffusion models to convert existing RGB images to thermal images using self-attention to learn the thermal properties of real-world objects.
- Score: 6.524847658755803
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous systems rely on sensors to estimate the environment around them. However, cameras, LiDARs, and RADARs have their own limitations. In nighttime or degraded environments such as fog, mist, or dust, thermal cameras can provide valuable information regarding the presence of objects of interest due to their heat signature. They make it easy to identify humans and vehicles that are usually at higher temperatures compared to their surroundings. In this paper, we focus on the adaptation of thermal cameras for robotics and automation, where the biggest hurdle is the lack of data. Several multi-modal datasets are available for driving robotics research in tasks such as scene segmentation, object detection, and depth estimation, which are the cornerstone of autonomous systems. However, they are found to be lacking in thermal imagery. Our paper proposes a solution to augment these datasets with synthetic thermal data to enable widespread and rapid adaptation of thermal cameras. We explore the use of conditional diffusion models to convert existing RGB images to thermal images using self-attention to learn the thermal properties of real-world objects.
Related papers
- Dataset and Benchmark: Novel Sensors for Autonomous Vehicle Perception [7.474695739346621]
This paper introduces the Novel Sensors for Autonomous Vehicle Perception dataset to facilitate future research on this topic.
The data was collected by repeatedly driving two 8 km routes and includes varied lighting conditions and opposing viewpoint perspectives.
To our knowledge, the NSAVP dataset is the first to include stereo thermal cameras together with stereo event and monochrome cameras.
arXiv Detail & Related papers (2024-01-24T23:25:23Z) - MIPI 2022 Challenge on RGB+ToF Depth Completion: Dataset and Report [92.61915017739895]
This paper introduces the first MIPI challenge including five tracks focusing on novel image sensors and imaging algorithms.
The participants were provided with a new dataset called TetrasRGBD, which contains 18k pairs of high-quality synthetic RGB+Depth training data and 2.3k pairs of testing data from mixed sources.
The final results are evaluated using objective metrics and Mean Opinion Score (MOS) subjectively.
arXiv Detail & Related papers (2022-09-15T05:31:53Z) - Drone Detection and Tracking in Real-Time by Fusion of Different Sensing
Modalities [66.4525391417921]
We design and evaluate a multi-sensor drone detection system.
Our solution integrates a fish-eye camera as well to monitor a wider part of the sky and steer the other cameras towards objects of interest.
The thermal camera is shown to be a feasible solution as good as the video camera, even if the camera employed here has a lower resolution.
arXiv Detail & Related papers (2022-07-05T10:00:58Z) - Machine Learning-Based Automated Thermal Comfort Prediction: Integration
of Low-Cost Thermal and Visual Cameras for Higher Accuracy [3.2872586139884623]
Real-time feedback system is needed to provide data about occupants' comfort conditions.
New solutions are required to bring a more holistic view toward non-intrusive thermal scanning.
arXiv Detail & Related papers (2022-04-14T15:30:16Z) - Meta-UDA: Unsupervised Domain Adaptive Thermal Object Detection using
Meta-Learning [64.92447072894055]
Infrared (IR) cameras are robust under adverse illumination and lighting conditions.
We propose an algorithm meta-learning framework to improve existing UDA methods.
We produce a state-of-the-art thermal detector for the KAIST and DSIAC datasets.
arXiv Detail & Related papers (2021-10-07T02:28:18Z) - Lidar Light Scattering Augmentation (LISA): Physics-based Simulation of
Adverse Weather Conditions for 3D Object Detection [60.89616629421904]
Lidar-based object detectors are critical parts of the 3D perception pipeline in autonomous navigation systems such as self-driving cars.
They are sensitive to adverse weather conditions such as rain, snow and fog due to reduced signal-to-noise ratio (SNR) and signal-to-background ratio (SBR)
arXiv Detail & Related papers (2021-07-14T21:10:47Z) - The Use of AI for Thermal Emotion Recognition: A Review of Problems and
Limitations in Standard Design and Data [36.33347149799959]
With the increased attention on thermal imagery for Covid-19 screening, the public sector may believe there are new opportunities to exploit thermal as a modality for computer vision and AI.
This paper takes the reader on a short review of machine learning in thermal FER and the limitations of collecting and developing thermal FER data for AI training.
arXiv Detail & Related papers (2020-09-22T14:58:59Z) - Exploring Thermal Images for Object Detection in Underexposure Regions
for Autonomous Driving [67.69430435482127]
Underexposure regions are vital to construct a complete perception of the surroundings for safe autonomous driving.
The availability of thermal cameras has provided an essential alternate to explore regions where other optical sensors lack in capturing interpretable signals.
This work proposes a domain adaptation framework which employs a style transfer technique for transfer learning from visible spectrum images to thermal images.
arXiv Detail & Related papers (2020-06-01T09:59:09Z) - Learning Camera Miscalibration Detection [83.38916296044394]
This paper focuses on a data-driven approach to learn the detection of miscalibration in vision sensors, specifically RGB cameras.
Our contributions include a proposed miscalibration metric for RGB cameras and a novel semi-synthetic dataset generation pipeline based on this metric.
By training a deep convolutional neural network, we demonstrate the effectiveness of our pipeline to identify whether a recalibration of the camera's intrinsic parameters is required or not.
arXiv Detail & Related papers (2020-05-24T10:32:49Z) - SurfelGAN: Synthesizing Realistic Sensor Data for Autonomous Driving [27.948417322786575]
We present a simple yet effective approach to generate realistic scenario sensor data.
Our approach uses texture-mapped surfels to efficiently reconstruct the scene from an initial vehicle pass or set of passes.
We then leverage a SurfelGAN network to reconstruct realistic camera images for novel positions and orientations of the self-driving vehicle.
arXiv Detail & Related papers (2020-05-08T04:01:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.