Thermal-Infrared Remote Target Detection System for Maritime Rescue
based on Data Augmentation with 3D Synthetic Data
- URL: http://arxiv.org/abs/2310.20412v1
- Date: Tue, 31 Oct 2023 12:37:49 GMT
- Authors: Sungjin Cheong, Wonho Jung, Yoon Seop Lim, Yong-Hwa Park
- Abstract summary: This paper proposes a thermal-infrared (TIR) remote target detection system for maritime rescue using deep learning and data augmentation.
To address dataset scarcity and improve model robustness, a synthetic dataset is collected from a 3D game (ARMA3) to augment the data.
The proposed segmentation model surpasses the performance of state-of-the-art segmentation methods.
- Score: 4.66313002591741
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper proposes a thermal-infrared (TIR) remote target detection system
for maritime rescue using deep learning and data augmentation. We established a
self-collected TIR dataset consisting of multiple scenes imitating human rescue
situations using a TIR camera (FLIR). Additionally, to address dataset scarcity
and improve model robustness, a synthetic dataset is further collected from a
3D game (ARMA3) to augment the data. However, a significant domain gap exists
between synthetic TIR and real TIR images. Hence, a proper domain adaptation
algorithm is essential to overcome the gap. Therefore, we suggest a domain
adaptation algorithm in a target-background separated manner from 3D
game-to-real, based on a generative model, to address this issue. Furthermore,
a segmentation network with fixed-weight kernels at the head is proposed to
improve the signal-to-noise ratio (SNR) and provide weak attention, as remote
TIR targets inherently suffer from unclear boundaries. Experimental results
reveal that the network trained on augmented data consisting of translated
synthetic and real TIR data outperforms that trained on only real TIR data by a
large margin. Furthermore, the proposed segmentation model surpasses the
performance of state-of-the-art segmentation methods.
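
The abstract states that fixed-weight kernels at the segmentation head improve the signal-to-noise ratio (SNR) of remote TIR targets, but it does not say which kernels are used. The following is a minimal sketch of the general idea only, assuming a normalized 5x5 Gaussian kernel and a simple contrast-over-noise SNR definition; the kernel choice, the function names, and the synthetic test image are all illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def gaussian_kernel(size: int = 5, sigma: float = 1.0) -> np.ndarray:
    """Return a normalized 2D Gaussian kernel with fixed (non-trainable) weights.

    Assumption: the paper's actual kernels are not given in the abstract.
    """
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()


def fixed_kernel_filter(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Filter a 2D image with a fixed kernel, keeping the input size (reflect padding)."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="reflect")
    out = np.zeros(image.shape, dtype=np.float64)
    # Accumulate shifted copies of the image, one per kernel tap.
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + image.shape[0], j:j + image.shape[1]]
    return out


def snr(image: np.ndarray, target_mask: np.ndarray) -> float:
    """Illustrative SNR: (mean target - mean background) / std of background."""
    background = image[~target_mask]
    return float((image[target_mask].mean() - background.mean()) / background.std())


def demo(seed: int = 0):
    """Compare SNR before and after filtering a small target in Gaussian noise."""
    rng = np.random.default_rng(seed)
    clean = np.zeros((64, 64))
    clean[30:35, 30:35] = 1.0  # small, warm target on a cold background
    mask = clean > 0
    noisy = clean + rng.normal(0.0, 0.5, size=clean.shape)
    filtered = fixed_kernel_filter(noisy, gaussian_kernel(5, 1.0))
    return snr(noisy, mask), snr(filtered, mask)


if __name__ == "__main__":
    before, after = demo()
    print(f"SNR before: {before:.2f}, after: {after:.2f}")
```

In this toy setting the fixed kernel averages out pixel-wise noise more than it attenuates the target, so the measured SNR rises. In a real network the same weights would typically sit in a frozen convolution layer rather than a NumPy loop.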
Related papers
- IRASNet: Improved Feature-Level Clutter Reduction for Domain Generalized SAR-ATR [8.857297839399193]
This study proposes a framework particularly designed for domain-generalized SAR-ATR called IRASNet.
IRASNet enables effective feature-level clutter reduction and domain-invariant feature learning.
IRASNet not only enhances performance but also significantly improves feature-level clutter reduction, making it a valuable advancement in the field of radar image pattern recognition.
arXiv Detail & Related papers (2024-09-25T11:53:58Z)
- Progressive Domain Adaptation for Thermal Infrared Object Tracking [9.888266596236578]
In this work, we propose a Progressive Domain Adaptation framework for TIR Tracking.
The framework makes full use of large-scale labeled RGB datasets without requiring time-consuming and labor-intensive labeling of large-scale TIR data.
Experimental results on five TIR tracking benchmarks show that the proposed method improves the success rate by nearly 6%, demonstrating its effectiveness.
arXiv Detail & Related papers (2024-07-28T08:43:16Z)
- Domain-Transferred Synthetic Data Generation for Improving Monocular Depth Estimation [9.812476193015488]
We propose a method of data generation in simulation using 3D synthetic environments and CycleGAN domain transfer.
We compare this method of data generation to the popular NYUDepth V2 dataset by training a depth estimation model based on the DenseDepth structure using different training sets of real and simulated data.
We evaluate the performance of the models on newly collected images and LiDAR depth data from a Husky robot to verify the generalizability of the approach and show that GAN-transformed data can serve as an effective alternative to real-world data, particularly in depth estimation.
arXiv Detail & Related papers (2024-05-02T09:21:10Z)
- IPoD: Implicit Field Learning with Point Diffusion for Generalizable 3D Object Reconstruction from Single RGB-D Images [50.4538089115248]
Generalizable 3D object reconstruction from single-view RGB-D images remains a challenging task.
We propose a novel approach, IPoD, which harmonizes implicit field learning with point diffusion.
Experiments conducted on the CO3D-v2 dataset affirm the superiority of IPoD, achieving 7.8% improvement in F-score and 28.6% in Chamfer distance over existing methods.
arXiv Detail & Related papers (2024-03-30T07:17:37Z)
- Cross-Cluster Shifting for Efficient and Effective 3D Object Detection in Autonomous Driving [69.20604395205248]
We present a new 3D point-based detector model, named Shift-SSD, for precise 3D object detection in autonomous driving.
We introduce an intriguing Cross-Cluster Shifting operation to unleash the representation capacity of the point-based detector.
We conduct extensive experiments on the KITTI, Waymo, and nuScenes datasets, and the results demonstrate the state-of-the-art performance of Shift-SSD.
arXiv Detail & Related papers (2024-03-10T10:36:32Z)
- SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection [53.19618419772467]
Single-frame infrared small target (SIRST) detection aims to recognize small targets from clutter backgrounds.
With the development of Transformer, the scale of SIRST models is constantly increasing.
With a rich diversity of infrared small target data, our algorithm significantly improves the model performance and convergence speed.
arXiv Detail & Related papers (2024-03-08T16:14:54Z)
- Point-aware Interaction and CNN-induced Refinement Network for RGB-D Salient Object Detection [95.84616822805664]
We introduce a CNN-assisted Transformer architecture and propose a novel RGB-D SOD network with Point-aware Interaction and CNN-induced Refinement.
To alleviate the block effect and loss of detail that the Transformer naturally introduces, we design a CNN-induced refinement (CNNR) unit for content refinement and supplementation.
arXiv Detail & Related papers (2023-08-17T11:57:49Z)
- ChiNet: Deep Recurrent Convolutional Learning for Multimodal Spacecraft Pose Estimation [3.964047152162558]
This paper presents an innovative deep learning pipeline which estimates the relative pose of a spacecraft by incorporating the temporal information from a rendezvous sequence.
It leverages the performance of long short-term memory (LSTM) units in modelling sequences of data for the processing of features extracted by a convolutional neural network (CNN) backbone.
Three distinct training strategies, which follow a coarse-to-fine funnelled approach, are combined to facilitate feature learning and improve end-to-end pose estimation by regression.
arXiv Detail & Related papers (2021-08-23T16:48:58Z)
- Deep Cellular Recurrent Network for Efficient Analysis of Time-Series Data with Spatial Information [52.635997570873194]
This work proposes a novel deep cellular recurrent neural network (DCRNN) architecture to process complex multi-dimensional time series data with spatial information.
The proposed architecture achieves state-of-the-art performance while using substantially fewer trainable parameters than comparable methods in the literature.
arXiv Detail & Related papers (2021-01-12T20:08:18Z)
- ePointDA: An End-to-End Simulation-to-Real Domain Adaptation Framework for LiDAR Point Cloud Segmentation [111.56730703473411]
Training deep neural networks (DNNs) on LiDAR data requires large-scale point-wise annotations.
Simulation-to-real domain adaptation (SRDA) trains a DNN using unlimited synthetic data with automatically generated labels.
ePointDA consists of three modules: self-supervised dropout noise rendering, statistics-invariant and spatially-adaptive feature alignment, and transferable segmentation learning.
arXiv Detail & Related papers (2020-09-07T23:46:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.