Robust Environment Perception for Automated Driving: A Unified Learning
Pipeline for Visual-Infrared Object Detection
- URL: http://arxiv.org/abs/2206.03943v1
- Date: Wed, 8 Jun 2022 15:02:58 GMT
- Title: Robust Environment Perception for Automated Driving: A Unified Learning
Pipeline for Visual-Infrared Object Detection
- Authors: Mohsen Vadidar, Ali Kariminezhad, Christian Mayr, Laurent Kloeker and
Lutz Eckstein
- Abstract summary: In this paper, we exploit both visual and thermal perception units for robust object detection purposes.
- Score: 2.478658210785
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The RGB complementary metal-oxide-semiconductor (CMOS) sensor works within the
visible light spectrum. Therefore, it is very sensitive to environmental light
conditions. On the contrary, a long-wave infrared (LWIR) sensor, operating in
the 8-14 micrometer spectral band, functions independently of visible light.
In this paper, we exploit both visual and thermal perception units for robust
object detection purposes. After delicate synchronization and (cross-) labeling
of the FLIR [1] dataset, this multi-modal perception data passes through a
convolutional neural network (CNN) to detect three critical objects on the
road, namely pedestrians, bicycles, and cars. After evaluating the RGB and
infrared (thermal and infrared are often used interchangeably) sensors
separately, we compare various network structures that fuse the data
effectively at the feature level. Our RGB-thermal (RGBT) fusion network, which takes
advantage of a novel entropy-block attention module (EBAM), outperforms the
state-of-the-art network [2] by 10% with 82.9% mAP.
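The abstract describes fusing RGB and thermal features at the feature level, weighted by an entropy-based attention block. The paper's EBAM details are not given here, so the following is only a minimal NumPy sketch of the general idea under one plausible assumption: channels carrying more information (higher Shannon entropy) from each modality receive larger fusion weights. All function names and the per-channel softmax scheme are illustrative, not the authors' actual module.

```python
import numpy as np

def channel_entropy(feat, bins=16):
    """Shannon entropy (bits) of each channel in a (C, H, W) feature map."""
    ent = np.empty(feat.shape[0])
    for c in range(feat.shape[0]):
        hist, _ = np.histogram(feat[c], bins=bins)
        p = hist / hist.sum()
        p = p[p > 0]                      # drop empty bins to avoid log(0)
        ent[c] = -(p * np.log2(p)).sum()
    return ent

def entropy_attention_fusion(rgb_feat, thermal_feat):
    """Fuse two (C, H, W) feature maps, softmax-weighting the two
    modalities per channel by how informative each channel is."""
    e = np.stack([channel_entropy(rgb_feat), channel_entropy(thermal_feat)])
    w = np.exp(e)
    w = w / w.sum(axis=0, keepdims=True)  # per-channel weights sum to 1
    return (w[0][:, None, None] * rgb_feat
            + w[1][:, None, None] * thermal_feat)

rng = np.random.default_rng(0)
rgb = rng.normal(size=(8, 32, 32))       # toy RGB-branch features
thermal = rng.normal(size=(8, 32, 32))   # toy thermal-branch features
fused = entropy_attention_fusion(rgb, thermal)
print(fused.shape)  # (8, 32, 32)
```

In a real detector this weighting would be applied inside the CNN (e.g. after each backbone stage) rather than to raw NumPy arrays, but the sketch shows why an entropy criterion can adaptively favor the thermal branch at night and the RGB branch in daylight.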
Related papers
- Infra-YOLO: Efficient Neural Network Structure with Model Compression for Real-Time Infrared Small Object Detection [4.586010474241955]
A new dataset named InfraTiny was constructed, in which more than 85% of the bounding boxes are smaller than 32x32 pixels (3,218 images and a total of 20,893 bounding boxes).
A multi-scale attention mechanism module (MSAM) and a Feature Fusion Augmentation Pyramid Module (FFAFPM) were proposed and deployed onto embedded devices.
By integrating the proposed methods into the YOLO model, which is named Infra-YOLO, infrared small object detection performance has been improved.
arXiv Detail & Related papers (2024-08-14T10:49:14Z) - IRSAM: Advancing Segment Anything Model for Infrared Small Target Detection [55.554484379021524]
Infrared Small Target Detection (IRSTD) task falls short in achieving satisfying performance due to a notable domain gap between natural and infrared images.
We propose the IRSAM model for IRSTD, which improves SAM's encoder-decoder architecture to learn better feature representation of infrared small objects.
arXiv Detail & Related papers (2024-07-10T10:17:57Z) - SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised
Learning for Robust Infrared Small Target Detection [53.19618419772467]
Single-frame infrared small target (SIRST) detection aims to recognize small targets from clutter backgrounds.
With the development of Transformer, the scale of SIRST models is constantly increasing.
With a rich diversity of infrared small target data, our algorithm significantly improves the model performance and convergence speed.
arXiv Detail & Related papers (2024-03-08T16:14:54Z) - Removal then Selection: A Coarse-to-Fine Fusion Perspective for RGB-Infrared Object Detection [20.12812979315803]
Object detection utilizing both visible (RGB) and thermal infrared (IR) imagery has garnered extensive attention.
Most existing multi-modal object detection methods directly input the RGB and IR images into deep neural networks.
We propose a novel coarse-to-fine perspective to purify and fuse features from both modalities.
arXiv Detail & Related papers (2024-01-19T14:49:42Z) - Interactive Context-Aware Network for RGB-T Salient Object Detection [7.544240329265388]
We propose a novel network called Interactive Context-Aware Network (ICANet)
ICANet contains three modules that can effectively perform the cross-modal and cross-scale fusions.
Experiments prove that our network performs favorably against the state-of-the-art RGB-T SOD methods.
arXiv Detail & Related papers (2022-11-11T10:04:36Z) - Learning Enriched Illuminants for Cross and Single Sensor Color
Constancy [182.4997117953705]
We propose cross-sensor self-supervised training to train the network.
We train the network by randomly sampling the artificial illuminants in a sensor-independent manner.
Experiments show that our cross-sensor model and single-sensor model outperform other state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2022-03-21T15:45:35Z) - Infrared Small-Dim Target Detection with Transformer under Complex
Backgrounds [155.388487263872]
We propose a new infrared small-dim target detection method with the transformer.
We adopt the self-attention mechanism of the transformer to learn the interaction information of image features in a larger range.
We also design a feature enhancement module to learn more features of small-dim targets.
arXiv Detail & Related papers (2021-09-29T12:23:41Z) - SSTN: Self-Supervised Domain Adaptation Thermal Object Detection for
Autonomous Driving [6.810856082577402]
We propose a deep neural network, the Self-Supervised Thermal Network (SSTN), which learns a feature embedding that maximizes the mutual information between the visible and infrared spectrum domains via contrastive learning.
The proposed method is extensively evaluated on the two publicly available datasets: the FLIR-ADAS dataset and the KAIST Multi-Spectral dataset.
arXiv Detail & Related papers (2021-03-04T16:42:49Z) - Exploring Thermal Images for Object Detection in Underexposure Regions
for Autonomous Driving [67.69430435482127]
Underexposure regions are vital to construct a complete perception of the surroundings for safe autonomous driving.
The availability of thermal cameras has provided an essential alternate to explore regions where other optical sensors lack in capturing interpretable signals.
This work proposes a domain adaptation framework which employs a style transfer technique for transfer learning from visible spectrum images to thermal images.
arXiv Detail & Related papers (2020-06-01T09:59:09Z) - Drone-based RGB-Infrared Cross-Modality Vehicle Detection via
Uncertainty-Aware Learning [59.19469551774703]
Drone-based vehicle detection aims at finding the vehicle locations and categories in an aerial image.
We construct a large-scale drone-based RGB-Infrared vehicle detection dataset, termed DroneVehicle.
Our DroneVehicle collects 28,439 RGB-Infrared image pairs, covering urban roads, residential areas, parking lots, and other scenarios from day to night.
arXiv Detail & Related papers (2020-03-05T05:29:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.