TFDet: Target-Aware Fusion for RGB-T Pedestrian Detection
- URL: http://arxiv.org/abs/2305.16580v3
- Date: Wed, 18 Oct 2023 01:45:06 GMT
- Title: TFDet: Target-Aware Fusion for RGB-T Pedestrian Detection
- Authors: Xue Zhang, Xiao-Han Zhang, Jiacheng Ying, Zehua Sheng, Heng Yu,
Chunguang Li, Hui-Liang Shen
- Abstract summary: We propose a novel target-aware fusion strategy for multispectral pedestrian detection, named TFDet.
Our fusion strategy highlights the pedestrian-related features and suppresses unrelated ones, generating more discriminative fused features. TFDet achieves state-of-the-art performance on both KAIST and LLVIP benchmarks, with an efficiency comparable to the previous state-of-the-art counterpart.
- Score: 21.502127701404792
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Pedestrian detection plays a critical role in computer vision as it
contributes to ensuring traffic safety. Existing methods that rely solely on
RGB images suffer from performance degradation under low-light conditions due
to the lack of useful information. To address this issue, recent multispectral
detection approaches have combined thermal images to provide complementary
information and have obtained enhanced performance. Nevertheless, few
approaches focus on the negative effects of false positives caused by noisy
fused feature maps. In contrast, we comprehensively analyze the impact of
false positives on detection performance and find that enhancing feature
contrast can significantly reduce them. In this paper, we
propose a novel target-aware fusion strategy for multispectral pedestrian
detection, named TFDet. Our fusion strategy highlights the pedestrian-related
features and suppresses unrelated ones, generating more discriminative fused
features. TFDet achieves state-of-the-art performance on both KAIST and LLVIP
benchmarks, with an efficiency comparable to the previous state-of-the-art
counterpart. Importantly, TFDet performs remarkably well even under low-light
conditions, which is a significant advancement for ensuring road safety. The
code will be made publicly available at https://github.com/XueZ-phd/TFDet.git.
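The abstract states the fusion principle, highlighting pedestrian-related features while suppressing unrelated ones, but not the architecture. Below is a minimal PyTorch sketch of one way such contrast-raising fusion could look; the module name, mask head, and re-weighting rule are illustrative assumptions, not TFDet's actual design.

```python
# Hypothetical sketch of target-aware RGB-T feature fusion: a spatial
# "target-ness" mask re-weights the fused features so likely-pedestrian
# regions are amplified and background is suppressed. Not TFDet's
# actual architecture.
import torch
import torch.nn as nn

class TargetAwareFusion(nn.Module):  # hypothetical module name
    def __init__(self, channels: int):
        super().__init__()
        self.reduce = nn.Conv2d(2 * channels, channels, kernel_size=1)
        # Predicts a per-pixel target-ness mask in [0, 1].
        self.mask_head = nn.Sequential(
            nn.Conv2d(channels, channels // 4, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, feat_rgb, feat_thermal):
        fused = self.reduce(torch.cat([feat_rgb, feat_thermal], dim=1))
        mask = self.mask_head(fused)  # (B, 1, H, W)
        # mask ~ 1 amplifies a region, mask ~ 0 halves it, widening the
        # gap between target and background responses (feature contrast).
        return fused * (0.5 + mask)

if __name__ == "__main__":
    fuse = TargetAwareFusion(channels=64)
    rgb, thermal = torch.randn(2, 64, 40, 32), torch.randn(2, 64, 40, 32)
    print(fuse(rgb, thermal).shape)  # torch.Size([2, 64, 40, 32])
```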
Related papers
- Take Fake as Real: Realistic-like Robust Black-box Adversarial Attack to Evade AIGC Detection [4.269334070603315]
We propose a realistic-like Robust Black-box Adversarial attack (R$^2$BA) with post-processing fusion optimization.
We show that R$2$BA exhibits impressive anti-detection performance, excellent invisibility, and strong robustness in GAN-based and diffusion-based cases.
arXiv Detail & Related papers (2024-12-09T18:16:50Z)
- Optimizing Multispectral Object Detection: A Bag of Tricks and Comprehensive Benchmarks [49.84182981950623]
Multispectral object detection, utilizing RGB and TIR (thermal infrared) modalities, is widely recognized as a challenging task.
It requires not only the effective extraction of features from both modalities and robust fusion strategies, but also the ability to address issues such as spectral discrepancies.
We introduce an efficient and easily deployable multispectral object detection framework that can seamlessly optimize high-performing single-modality models.
arXiv Detail & Related papers (2024-11-27T12:18:39Z)
- SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection [53.19618419772467]
Single-frame infrared small target (SIRST) detection aims to recognize small targets against cluttered backgrounds.
With the development of Transformers, the scale of SIRST models has grown steadily.
By synthesizing a rich diversity of infrared small-target data, our algorithm significantly improves model performance and convergence speed.
arXiv Detail & Related papers (2024-03-08T16:14:54Z)
- ReDFeat: Recoupling Detection and Description for Multimodal Feature Learning [51.07496081296863]
We recouple the independent constraints of detection and description in multimodal feature learning through a mutual weighting strategy (sketched below).
We propose a detector that possesses a large receptive field and is equipped with learnable non-maximum suppression layers.
We build a benchmark containing cross-modal visible, infrared, near-infrared, and synthetic aperture radar image pairs for evaluating features on feature matching and image registration tasks.
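A hedged sketch of what such mutual weighting could look like; the specific weighting functions are assumptions for illustration, not ReDFeat's exact formulation.

```python
# Illustrative mutual weighting between a keypoint-detection loss and a
# descriptor loss: each branch's per-pixel loss is re-weighted by the other
# branch's reliability, coupling the two objectives. The weighting choices
# here are assumptions, not ReDFeat's exact formulation.
import torch

def mutually_weighted_loss(det_score, det_loss_map, desc_loss_map):
    """det_score: (B, 1, H, W) detection confidence in [0, 1];
    det_loss_map, desc_loss_map: per-pixel losses of the two branches."""
    # Descriptors are penalized mainly where the detector is confident ...
    desc_term = (det_score.detach() * desc_loss_map).mean()
    # ... and detection is weighted where descriptors already match well
    # (low descriptor loss -> reliability close to 1).
    desc_reliability = torch.exp(-desc_loss_map.detach())
    det_term = (desc_reliability * det_loss_map).mean()
    return det_term + desc_term
```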
arXiv Detail & Related papers (2022-05-16T04:24:22Z)
- Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection [65.30079184700755]
This study addresses the issue of fusing infrared and visible images that appear differently for object detection.
Previous approaches discover commonalities underlying the two modalities and fuse in that common space, either by iterative optimization or by deep networks.
This paper proposes a bilevel optimization formulation for the joint problem of fusion and detection, and then unrolls it into a target-aware Dual Adversarial Learning (TarDAL) network for fusion followed by a commonly used detection network (see the sketch below).
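A rough sketch of the kind of joint objective such a bilevel formulation implies, assuming simple L1 reconstruction terms and a hypothetical detector.loss API; TarDAL's actual losses are adversarial and more elaborate.

```python
# Rough sketch of a joint fusion-plus-detection objective. The L1
# reconstruction terms, detector.loss API, and weight lam are assumptions;
# TarDAL's actual losses are adversarial and more elaborate.
def joint_objective(fuser, detector, ir, vis, targets, lam=1.0):
    fused = fuser(ir, vis)
    # Lower level: the fused image stays faithful to both source images.
    recon = (fused - ir).abs().mean() + (fused - vis).abs().mean()
    # Upper level: the fused image is useful for the downstream detector.
    det_loss = detector.loss(fused, targets)  # hypothetical API
    return recon + lam * det_loss
```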
arXiv Detail & Related papers (2022-03-30T11:44:56Z)
- Benchmarking Deep Models for Salient Object Detection [67.07247772280212]
We construct a general SALient Object Detection (SALOD) benchmark to conduct a comprehensive comparison among several representative SOD methods.
In these experiments, we find that existing loss functions are usually specialized for some metrics but yield inferior results on others.
We propose a novel Edge-Aware (EA) loss that promotes deep networks to learn more discriminative features by integrating both pixel- and image-level supervision signals.
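One plausible construction of such a loss, assuming an edge map derived from the ground truth and a soft-IoU image-level term; this is a generic sketch in the spirit of the abstract, not necessarily the paper's EA loss.

```python
# Generic sketch of an edge-aware saliency loss combining pixel- and
# image-level supervision; the edge extraction and soft-IoU term are
# assumptions, not necessarily the paper's EA loss.
import torch
import torch.nn.functional as F

def edge_aware_loss(pred, gt, edge_weight=4.0):
    """pred: (B, 1, H, W) logits; gt: (B, 1, H, W) float binary masks."""
    # Edge map via a cheap morphological gradient on the ground truth.
    dilated = F.max_pool2d(gt, 3, stride=1, padding=1)
    eroded = -F.max_pool2d(-gt, 3, stride=1, padding=1)
    edges = (dilated - eroded).clamp(0, 1)
    # Pixel-level supervision: cross-entropy, up-weighted near edges.
    weights = 1.0 + edge_weight * edges
    bce = F.binary_cross_entropy_with_logits(pred, gt, weight=weights)
    # Image-level supervision: soft IoU over the whole saliency map.
    p = torch.sigmoid(pred)
    inter = (p * gt).sum(dim=(2, 3))
    union = (p + gt - p * gt).sum(dim=(2, 3))
    soft_iou = 1.0 - (inter + 1e-6) / (union + 1e-6)
    return bce + soft_iou.mean()
```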
arXiv Detail & Related papers (2022-02-07T03:43:16Z)
- Illumination and Temperature-Aware Multispectral Networks for Edge-Computing-Enabled Pedestrian Detection [10.454696553567809]
This study proposes a lightweight Illumination and Temperature-aware Multispectral Network (IT-MN) for accurate and efficient pedestrian detection.
The proposed algorithm is evaluated against selected state-of-the-art algorithms on a public dataset collected by in-vehicle cameras.
The results show that the proposed algorithm achieves a miss rate of 14.19% and an inference time of 0.03 seconds per image pair on GPU.
arXiv Detail & Related papers (2021-12-09T17:27:23Z)
- Multimodal Object Detection via Bayesian Fusion [59.31437166291557]
We study multimodal object detection with RGB and thermal cameras, since the latter can provide much stronger object signatures under poor illumination.
Our key contribution is a non-learned late-fusion method that fuses together bounding box detections from different modalities.
We apply our approach to benchmarks containing both aligned (KAIST) and unaligned (FLIR) multimodal sensor data.
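For matched detections across modalities, this style of non-learned score fusion can be written with Bayes' rule under a conditional-independence assumption; the sketch below shows only that scoring rule, omitting box matching and coordinate fusion.

```python
# Bayesian late fusion of per-class scores from two single-modality
# detectors, assuming conditional independence given the class:
# p(y | rgb, thermal) is proportional to p(y | rgb) * p(y | thermal) / p(y).
# Box matching and coordinate fusion are omitted for brevity.
import numpy as np

def bayesian_score_fusion(scores_rgb, scores_thermal, prior):
    """Each argument is a length-C array over classes; the scores are the
    posteriors of each detector, prior is the class prior p(y)."""
    fused = scores_rgb * scores_thermal / prior
    return fused / fused.sum()  # renormalize to a distribution

# Two weak, independent pieces of evidence combine into a confident one:
print(bayesian_score_fusion(
    np.array([0.6, 0.4]), np.array([0.7, 0.3]), np.array([0.5, 0.5])))
# -> [0.7777... 0.2222...]
```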
arXiv Detail & Related papers (2021-04-07T04:03:20Z)
- Robust Correlation Tracking via Multi-channel Fused Features and Reliable Response Map [10.079856376445598]
This paper proposes a robust correlation tracking algorithm (RCT) based on two ideas.
First, we propose a feature-fusion method that more naturally describes the gradient and color information of the tracked object.
Second, we present a novel strategy to significantly reduce noise in the response map and therefore ease the problem of model drift.
arXiv Detail & Related papers (2020-11-25T07:15:03Z)
- Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim to learn pedestrian representations based on object center and scale rather than direct bounding-box predictions (see the sketch below).
Results show our method's effectiveness in detecting small-scale pedestrians.
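A minimal sketch of a center-and-scale detection head; the CSP-style layout, layer shapes, and decoding note are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal center-and-scale pedestrian head (CSP-style assumption): instead
# of regressing boxes directly, predict a center heatmap and a per-pixel
# scale; layer shapes here are illustrative.
import torch.nn as nn

class CenterScaleHead(nn.Module):
    def __init__(self, channels: int = 256):
        super().__init__()
        self.center = nn.Conv2d(channels, 1, kernel_size=1)  # center logits
        self.scale = nn.Conv2d(channels, 1, kernel_size=1)   # log-height

    def forward(self, feats):
        # Boxes are recovered afterwards from heatmap peaks plus the
        # predicted scale and a fixed pedestrian aspect ratio.
        return self.center(feats), self.scale(feats)
```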
arXiv Detail & Related papers (2020-08-19T13:13:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.