Does Thermal data make the detection systems more reliable?
- URL: http://arxiv.org/abs/2111.05191v1
- Date: Tue, 9 Nov 2021 15:04:34 GMT
- Title: Does Thermal data make the detection systems more reliable?
- Authors: Shruthi Gowda, Bahram Zonooz, Elahe Arani
- Abstract summary: We propose a comprehensive detection system based on a multimodal-collaborative framework.
This framework learns from both RGB (from visual cameras) and thermal (from Infrared cameras) data.
Our empirical results show that while the improvement in accuracy is nominal, the value lies in challenging and extremely difficult edge cases.
- Score: 1.2891210250935146
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Deep learning-based detection networks have made remarkable progress in
autonomous driving systems (ADS). ADS should have reliable performance across a
variety of ambient lighting and adverse weather conditions. However, luminance
degradation and visual obstructions (such as glare and fog) result in poor-quality
images from the visual camera, which leads to performance decline. To overcome
these challenges, we explore the idea of leveraging a different data modality
that is disparate yet complementary to the visual data. We propose a
comprehensive detection system based on a multimodal-collaborative framework
that learns from both RGB (from visual cameras) and thermal (from Infrared
cameras) data. This framework trains two networks collaboratively and provides
flexibility in learning optimal features of its own modality while also
incorporating the complementary knowledge of the other. Our extensive empirical
results show that while the improvement in accuracy is nominal, the value lies
in challenging and extremely difficult edge cases, which are crucial in
safety-critical applications such as AD. We provide a holistic view of both
merits and limitations of using a thermal imaging system in detection.
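The abstract describes the collaborative scheme only at a high level, so the following is a minimal PyTorch sketch of one plausible instantiation: each network trains on its own modality with a supervised loss, plus a mutual-consistency term that exchanges knowledge between the two streams. The module names, dimensions, and symmetric-KL coupling are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in backbones, one per modality (a real system would use full
# detection networks such as Faster R-CNN or YOLO variants).
rgb_net = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))
thermal_net = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))
opt = torch.optim.Adam(
    list(rgb_net.parameters()) + list(thermal_net.parameters()), lr=1e-3)

def mutual_consistency(p_logits, q_logits):
    # Symmetric KL between the two networks' predictive distributions:
    # each stream keeps its own features but absorbs the other's knowledge.
    p, q = F.log_softmax(p_logits, dim=1), F.log_softmax(q_logits, dim=1)
    return 0.5 * (F.kl_div(p, q.exp(), reduction="batchmean")
                  + F.kl_div(q, p.exp(), reduction="batchmean"))

# One illustrative training step on a dummy paired RGB/thermal batch.
rgb, thermal = torch.randn(4, 3, 64, 64), torch.randn(4, 1, 64, 64)
labels = torch.randint(0, 10, (4,))

logits_rgb, logits_th = rgb_net(rgb), thermal_net(thermal)
task_loss = F.cross_entropy(logits_rgb, labels) + F.cross_entropy(logits_th, labels)
loss = task_loss + 0.5 * mutual_consistency(logits_rgb, logits_th)

opt.zero_grad()
loss.backward()
opt.step()
```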
Related papers
- Light the Night: A Multi-Condition Diffusion Framework for Unpaired Low-Light Enhancement in Autonomous Driving [45.97279394690308]
LightDiff is a framework designed to enhance the low-light image quality for autonomous driving applications.
It incorporates a novel multi-condition adapter that adaptively controls the input weights from different modalities, including depth maps, RGB images, and text captions.
It can significantly improve the performance of several state-of-the-art 3D detectors in night-time conditions while achieving high visual quality scores.
arXiv Detail & Related papers (2024-04-07T04:10:06Z)
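The abstract does not detail the adapter, so here is a hedged sketch of what "adaptively controls the input weights from different modalities" could look like: a small gating network scores each condition embedding (depth, RGB, text) and fuses them with per-input softmax weights. Names and dimensions are assumptions, not LightDiff's implementation.

```python
import torch
import torch.nn as nn

class MultiConditionAdapter(nn.Module):
    """Hypothetical adapter: scores each condition embedding (depth, RGB,
    text) with a small gating network and fuses them with per-input
    softmax weights."""
    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Linear(dim, 1)

    def forward(self, cond_embeddings: torch.Tensor) -> torch.Tensor:
        # cond_embeddings: (batch, num_conditions, dim)
        scores = self.gate(cond_embeddings).squeeze(-1)        # (batch, n_cond)
        weights = torch.softmax(scores, dim=-1).unsqueeze(-1)  # adaptive weights
        return (weights * cond_embeddings).sum(dim=1)          # fused signal

adapter = MultiConditionAdapter(dim=256)
depth, rgb, text = torch.randn(2, 256), torch.randn(2, 256), torch.randn(2, 256)
fused = adapter(torch.stack([depth, rgb, text], dim=1))  # -> (2, 256)
```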
- D-YOLO a robust framework for object detection in adverse weather conditions [0.0]
Adverse weather conditions, including haze, snow, and rain, lead to a decline in image quality, which often degrades the performance of deep-learning-based detection networks.
To better integrate image restoration and object detection tasks, we designed a double-route network with an attention feature fusion module.
We also propose a subnetwork that provides haze-free features to the detection network. Specifically, D-YOLO improves detection performance by minimizing the distance between features from the clear feature extraction subnetwork and the detection network.
arXiv Detail & Related papers (2024-03-14T09:57:15Z)
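The feature-alignment idea, minimizing the distance between the clear-feature subnetwork and the detection network, can be sketched as a simple MSE term; the tensors below are placeholders, and D-YOLO's actual modules and loss may differ.

```python
import torch
import torch.nn.functional as F

# Placeholder feature maps: one from a subnetwork run on clear (haze-free)
# images, one from the detection backbone on the degraded input.
feat_clear = torch.randn(4, 256, 20, 20)                       # teacher features
feat_detect = torch.randn(4, 256, 20, 20, requires_grad=True)  # backbone features

# Alignment objective in the spirit of D-YOLO: pull detection features
# toward the clear features so detection degrades less under haze/snow/rain.
align_loss = F.mse_loss(feat_detect, feat_clear.detach())
align_loss.backward()
```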
- Multi-Attention Fusion Drowsy Driving Detection Model [1.2043574473965317]
We introduce a novel approach called the Multi-Attention Fusion Drowsy Driving Detection Model (MAF).
Our proposed model achieves an impressive driver drowsiness detection accuracy of 96.8%.
arXiv Detail & Related papers (2023-12-28T14:53:32Z)
- MISFIT-V: Misaligned Image Synthesis and Fusion using Information from Thermal and Visual [2.812395851874055]
This work presents Misaligned Image Synthesis and Fusion using Information from Thermal and Visual (MISFIT-V)
It is a novel two-pronged unsupervised deep learning approach that utilizes a Generative Adversarial Network (GAN) and a cross-attention mechanism to capture the most relevant features from each modality.
Experimental results show MISFIT-V offers enhanced robustness against misalignment and poor lighting/thermal environmental conditions.
arXiv Detail & Related papers (2023-09-22T23:41:24Z)
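A cross-attention fusion step of the kind the abstract mentions can be sketched with a standard attention layer: the visual stream queries the thermal stream so each location gathers the most relevant complementary evidence. This is a generic sketch, not MISFIT-V's architecture.

```python
import torch
import torch.nn as nn

# Generic cross-attention fusion step between two modalities.
dim, heads = 128, 4
cross_attn = nn.MultiheadAttention(embed_dim=dim, num_heads=heads,
                                   batch_first=True)

visual = torch.randn(2, 400, dim)   # e.g. a 20x20 feature map as 400 tokens
thermal = torch.randn(2, 400, dim)

# Query from visual, keys/values from thermal (the reverse direction can
# be run symmetrically to complete a two-pronged fusion).
fused, attn_weights = cross_attn(query=visual, key=thermal, value=thermal)
```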
- Hybrid-Supervised Dual-Search: Leveraging Automatic Learning for Loss-free Multi-Exposure Image Fusion [60.221404321514086]
Multi-exposure image fusion (MEF) has emerged as a prominent solution to address the limitations of digital imaging in representing varied exposure levels.
This paper presents a Hybrid-Supervised Dual-Search approach for MEF, dubbed HSDS-MEF, which introduces a bi-level optimization search scheme for automatic design of both network structures and loss functions.
arXiv Detail & Related papers (2023-09-03T08:07:26Z)
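Bi-level search over loss functions can be illustrated with a first-order toy: an inner loop fits model weights under the current loss design, and an outer loop adjusts the loss coefficients on held-out data. HSDS-MEF also searches network structures; everything below is an assumption-laden simplification, not the paper's scheme.

```python
import torch

# Toy bi-level search over loss coefficients (a first-order approximation,
# in the spirit of DARTS-style alternating schemes).
w = torch.randn(3, requires_grad=True)                # "network" parameters
alpha = torch.tensor([0.5, 0.5], requires_grad=True)  # searchable loss weights
inner_opt = torch.optim.SGD([w], lr=0.1)
outer_opt = torch.optim.Adam([alpha], lr=0.01)

x_train, x_val = torch.randn(32, 3), torch.randn(32, 3)

def combined_loss(params, data, coeffs):
    # Two competing objectives, e.g. a fidelity term and a sparsity term.
    pred = data @ params
    return coeffs[0] * pred.pow(2).mean() + coeffs[1] * params.abs().sum()

for step in range(20):
    # Inner level: optimize network weights with the loss design fixed.
    inner_opt.zero_grad()
    combined_loss(w, x_train, alpha.detach()).backward()
    inner_opt.step()

    # Outer level: adjust the loss coefficients against validation data.
    outer_opt.zero_grad()
    combined_loss(w.detach(), x_val, alpha).backward()
    outer_opt.step()
```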
- Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection [65.30079184700755]
This study addresses the issue of fusing infrared and visible images that appear differently for object detection.
Previous approaches discover commonalities underlying the two modalities and fuse in the common space either by iterative optimization or deep networks.
This paper proposes a bilevel optimization formulation for the joint problem of fusion and detection, and then unrolls to a target-aware Dual Adversarial Learning (TarDAL) network for fusion and a commonly used detection network.
arXiv Detail & Related papers (2022-03-30T11:44:56Z)
- A Synthesis-Based Approach for Thermal-to-Visible Face Verification [105.63410428506536]
This paper presents an algorithm that achieves state-of-the-art performance on the ARL-VTF and TUFTS multi-spectral face datasets.
We also present MILAB-VTF(B), a challenging multi-spectral face dataset composed of paired thermal and visible videos.
arXiv Detail & Related papers (2021-08-21T17:59:56Z)
- Semantics-aware Adaptive Knowledge Distillation for Sensor-to-Vision Action Recognition [131.6328804788164]
We propose a framework, named Semantics-aware Adaptive Knowledge Distillation Networks (SAKDN), to enhance action recognition in the vision-sensor modality (videos).
The SAKDN uses multiple wearable-sensors as teacher modalities and uses RGB videos as student modality.
arXiv Detail & Related papers (2020-09-01T03:38:31Z)
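The distillation core of such a sensor-to-vision setup can be sketched with standard temperature-scaled knowledge distillation: a frozen sensor-modality teacher guides an RGB-video student. SAKDN's semantics-aware and adaptive components go beyond this; the sketch only shows the basic teacher-student coupling, with all names and dimensions assumed.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in teacher (wearable-sensor stream) and student (RGB-video stream).
teacher = nn.Linear(64, 11)   # assumed pretrained on sensor data; frozen here
student = nn.Linear(512, 11)  # trained on video features
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

sensor_x, video_x = torch.randn(8, 64), torch.randn(8, 512)
labels = torch.randint(0, 11, (8,))
T = 4.0  # distillation temperature

with torch.no_grad():
    teacher_logits = teacher(sensor_x)  # teacher guidance, no gradient

student_logits = student(video_x)
kd_loss = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                   F.softmax(teacher_logits / T, dim=1),
                   reduction="batchmean") * T * T
loss = F.cross_entropy(student_logits, labels) + kd_loss
opt.zero_grad()
loss.backward()
opt.step()
```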
- Exploring Thermal Images for Object Detection in Underexposure Regions for Autonomous Driving [67.69430435482127]
Underexposure regions are vital to constructing a complete perception of the surroundings for safe autonomous driving.
The availability of thermal cameras has provided an essential alternative for exploring regions where other optical sensors fail to capture interpretable signals.
This work proposes a domain adaptation framework which employs a style transfer technique for transfer learning from visible spectrum images to thermal images.
arXiv Detail & Related papers (2020-06-01T09:59:09Z)
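The style-transfer-based transfer learning idea can be sketched as a two-stage pipeline: a generator renders labelled visible-spectrum images in a thermal-like style, and the detector is fine-tuned on the stylized images while reusing the original annotations. The generator and detector below are stand-ins, not the paper's models.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

generator = nn.Sequential(                 # stand-in for a trained
    nn.Conv2d(3, 16, 3, padding=1),        # image-translation model
    nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),        # 1-channel "thermal" output
    nn.Sigmoid())
detector = nn.Sequential(                  # stand-in detection head
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 2))
opt = torch.optim.Adam(detector.parameters(), lr=1e-3)

visible = torch.randn(4, 3, 64, 64)   # labelled RGB training images
labels = torch.randint(0, 2, (4,))    # annotations carry over unchanged

with torch.no_grad():
    pseudo_thermal = generator(visible)  # visible -> thermal-style transfer

loss = F.cross_entropy(detector(pseudo_thermal), labels)
opt.zero_grad()
loss.backward()
opt.step()
```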
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and accepts no responsibility for any consequences of its use.