Multispectral Object Detection with Deep Learning
- URL: http://arxiv.org/abs/2102.03115v1
- Date: Fri, 5 Feb 2021 11:39:14 GMT
- Title: Multispectral Object Detection with Deep Learning
- Authors: Md Osman Gani, Somenath Kuiry, Alaka Das, Mita Nasipuri, Nibaran Das
- Abstract summary: In this work, we captured images in both the thermal and NIR spectra for the object detection task.
We train the YOLO v3 network from scratch to detect objects in multi-spectral images.
- Score: 7.592218846348004
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Object detection in natural scenes can be a challenging task. In many
real-life situations, the visible spectrum is not suitable for traditional
computer vision tasks. Moving outside the visible range, to the thermal
spectrum or near-infrared (NIR) images, is much more beneficial in
low-visibility conditions, and NIR images are particularly helpful for
understanding an object's material quality. In this work, we captured images in
both the thermal and NIR spectra for the object detection task. As
multi-spectral data with both thermal and NIR is not available for the
detection task, we had to collect the data ourselves. Data collection is a
time-consuming process, and we faced many obstacles that we had to overcome. We
train the YOLO v3 network from scratch to detect objects in multi-spectral
images. To avoid overfitting, we also applied data augmentation and tuned the
hyperparameters.
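The abstract does not specify which augmentations were used; a minimal sketch of what paired multispectral augmentation could look like (illustrative only, not the authors' code) is below. The key point is that the same random flip and crop must be applied to both spectra so that bounding-box geometry stays consistent across modalities.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment_pair(thermal, nir):
    """Apply the same random flip/crop to a thermal/NIR image pair.

    Augmenting both spectra identically keeps object locations
    aligned across modalities for detection training.
    """
    # Random horizontal flip, applied to both spectra together.
    if rng.random() < 0.5:
        thermal = thermal[:, ::-1]
        nir = nir[:, ::-1]

    # Random crop to 90% of the original size, same window for both.
    h, w = thermal.shape[:2]
    ch, cw = int(h * 0.9), int(w * 0.9)
    y = rng.integers(0, h - ch + 1)
    x = rng.integers(0, w - cw + 1)
    return thermal[y:y + ch, x:x + cw], nir[y:y + ch, x:x + cw]

# Example with YOLO-style 416x416 inputs.
thermal = np.zeros((416, 416), dtype=np.float32)
nir = np.zeros((416, 416), dtype=np.float32)
t_aug, n_aug = augment_pair(thermal, nir)
```

Any crop parameters or box-coordinate bookkeeping used in the actual work would of course differ; this only shows the paired-modality constraint.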
Related papers
- BihoT: A Large-Scale Dataset and Benchmark for Hyperspectral Camouflaged Object Tracking [22.533682363532403]
We provide a new task called hyperspectral camouflaged object tracking (HCOT)
We meticulously construct a large-scale HCOT dataset, termed BihoT, which consists of 41,912 hyperspectral images covering 49 video sequences.
A simple but effective baseline model, named spectral prompt-based distractor-aware network (SPDAN), is proposed.
arXiv Detail & Related papers (2024-08-22T09:07:51Z)
- Removal then Selection: A Coarse-to-Fine Fusion Perspective for RGB-Infrared Object Detection [20.12812979315803]
Object detection utilizing both visible (RGB) and thermal infrared (IR) imagery has garnered extensive attention.
Most existing multi-modal object detection methods directly input the RGB and IR images into deep neural networks.
We propose a novel coarse-to-fine perspective to purify and fuse features from both modalities.
arXiv Detail & Related papers (2024-01-19T14:49:42Z)
- Enhancing Low-Light Images Using Infrared-Encoded Images [81.8710581927427]
Previous arts mainly focus on the low-light images captured in the visible spectrum using pixel-wise loss.
We propose a novel approach to increase the visibility of images captured under low-light environments by removing the in-camera infrared (IR) cut-off filter.
arXiv Detail & Related papers (2023-07-09T08:29:19Z)
- Object Detection in Hyperspectral Image via Unified Spectral-Spatial Feature Aggregation [55.9217962930169]
We present S2ADet, an object detector that harnesses the rich spectral and spatial complementary information inherent in hyperspectral images.
S2ADet surpasses existing state-of-the-art methods, achieving robust and reliable results.
arXiv Detail & Related papers (2023-06-14T09:01:50Z)
- Fast Fourier Convolution Based Remote Sensor Image Object Detection for Earth Observation [0.0]
We propose a Frequency-aware Feature Pyramid Framework (FFPF) for remote sensing object detection.
F-ResNet is proposed to perceive the spectral context information by plugging the frequency domain convolution into each stage of the backbone.
The BSFPN is designed to use a bilateral sampling strategy and skipping connection to better model the association of object features at different scales.
arXiv Detail & Related papers (2022-09-01T15:50:58Z)
- Multitask AET with Orthogonal Tangent Regularity for Dark Object Detection [84.52197307286681]
We propose a novel multitask auto encoding transformation (MAET) model to enhance object detection in a dark environment.
In a self-supervision manner, the MAET learns the intrinsic visual structure by encoding and decoding the realistic illumination-degrading transformation.
We have achieved the state-of-the-art performance using synthetic and real-world datasets.
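The MAET summary describes learning from a realistic illumination-degrading transformation. As a rough sketch of that general idea (the gamma and noise parameters here are illustrative guesses, not the paper's model), a well-lit image can be synthetically darkened for self-supervision like this:

```python
import numpy as np

rng = np.random.default_rng(1)

def degrade_illumination(img, gamma_range=(2.0, 3.5), noise_std=0.02):
    """Synthesize a low-light version of a well-lit image.

    Gamma compression darkens the image and additive Gaussian noise
    mimics sensor noise; parameter values are illustrative only.
    """
    gamma = rng.uniform(*gamma_range)
    dark = np.power(np.clip(img, 0.0, 1.0), gamma)
    noisy = dark + rng.normal(0.0, noise_std, img.shape)
    return np.clip(noisy, 0.0, 1.0)

bright = np.full((4, 4, 3), 0.8, dtype=np.float64)
dark = degrade_illumination(bright)
```

A model trained to invert such a transformation (decode the original image) can learn illumination-robust features as a side effect, which is the self-supervised angle the abstract alludes to.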
arXiv Detail & Related papers (2022-05-06T16:27:14Z)
- Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction [91.43066633305662]
We propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD).
Specifically, we unify three complementary tasks: depth estimation, salient object detection and contour estimation. The multi-task mechanism promotes the model to learn the task-aware features from the auxiliary tasks.
Experiments show that it not only significantly surpasses the depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time.
arXiv Detail & Related papers (2022-03-09T17:20:18Z) - NeRF-Supervision: Learning Dense Object Descriptors from Neural Radiance
Fields [54.27264716713327]
We show that a Neural Radiance Fields (NeRF) representation of a scene can be used to train dense object descriptors.
We use an optimized NeRF to extract dense correspondences between multiple views of an object, and then use these correspondences as training data for learning a view-invariant representation of the object.
Dense correspondence models supervised with our method significantly outperform off-the-shelf learned descriptors by 106%.
arXiv Detail & Related papers (2022-03-03T18:49:57Z) - Drone Object Detection Using RGB/IR Fusion [1.5469452301122175]
We develop strategies for creating synthetic IR images using the AIRSim simulation engine and CycleGAN.
We utilize an illumination-aware fusion framework to fuse RGB and IR images for object detection on the ground.
Our solution is implemented on an NVIDIA Jetson Xavier running on an actual drone, requiring about 28 milliseconds of processing per RGB/IR image pair.
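The entry above mentions an illumination-aware fusion framework. A toy sketch of the general concept (the gating function and weights here are assumptions for illustration, not the paper's actual network) could weight the RGB and IR inputs by scene brightness:

```python
import numpy as np

def illumination_aware_fusion(rgb, ir):
    """Blend RGB and IR images with a weight driven by scene brightness.

    Bright scenes favour the RGB channels; dark scenes favour IR.
    The scalar luminance gate is a simplification for illustration.
    """
    # Scene brightness in [0, 1] from mean RGB intensity.
    luminance = rgb.mean()
    w_rgb = np.clip(luminance, 0.0, 1.0)

    # Broadcast the single-channel IR image across the RGB channels.
    ir3 = np.repeat(ir[..., None], 3, axis=-1)
    return w_rgb * rgb + (1.0 - w_rgb) * ir3

# Dark scene: fusion leans almost entirely on the IR image.
rgb = np.full((8, 8, 3), 0.1, dtype=np.float32)
ir = np.full((8, 8), 0.9, dtype=np.float32)
fused = illumination_aware_fusion(rgb, ir)
```

In practice such gating is typically learned per-region rather than as a single global scalar, but the blending structure is the same.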
arXiv Detail & Related papers (2022-01-11T05:15:59Z) - A Comparison of Deep Saliency Map Generators on Multispectral Data in
Object Detection [9.264502124445348]
This work investigates three saliency map generator methods on how their maps differ across the different spectra.
As a practical problem, we chose object detection in the infrared and visual spectrum for autonomous driving.
The results show that there are differences between the infrared and visual activation maps.
Further, advanced training with both the infrared and visual data not only improves the network's output but also leads to more focused spots in the saliency maps.
arXiv Detail & Related papers (2021-08-26T12:56:49Z)
- Exploring Thermal Images for Object Detection in Underexposure Regions for Autonomous Driving [67.69430435482127]
Underexposure regions are vital to construct a complete perception of the surroundings for safe autonomous driving.
The availability of thermal cameras has provided an essential alternate to explore regions where other optical sensors lack in capturing interpretable signals.
This work proposes a domain adaptation framework which employs a style transfer technique for transfer learning from visible spectrum images to thermal images.
arXiv Detail & Related papers (2020-06-01T09:59:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.