A High-Performance Thermal Infrared Object Detection Framework with Centralized Regulation
- URL: http://arxiv.org/abs/2505.10825v1
- Date: Fri, 16 May 2025 03:43:24 GMT
- Title: A High-Performance Thermal Infrared Object Detection Framework with Centralized Regulation
- Authors: Jinke Li, Yue Wu, Xiaoyan Yang,
- Abstract summary: We present a novel and efficient thermal object detection framework, known as CRT-YOLO, that is based on centralized feature regulation. Our proposed model integrates efficient multi-scale infrared attention modules, which adeptly capture long-range infrared. Experiments conducted on two benchmark datasets demonstrate that our CRT-YOLO model significantly outperforms conventional methods.
- Score: 5.935808994536907
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Thermal Infrared (TIR) technology involves the use of sensors to detect and measure infrared radiation emitted by objects, and it is widely utilized across a broad spectrum of applications. The advancements in object detection methods utilizing TIR images have sparked significant research interest. However, most traditional methods lack the capability to effectively extract and fuse local-global information, which is crucial for TIR-domain feature attention. In this study, we present a novel and efficient thermal infrared object detection framework, known as CRT-YOLO, that is based on centralized feature regulation, enabling the establishment of global-range interaction on TIR information. Our proposed model integrates efficient multi-scale attention (EMA) modules, which adeptly capture long-range dependencies while incurring minimal computational overhead. Additionally, it leverages the Centralized Feature Pyramid (CFP) network, which offers global regulation of TIR features. Extensive experiments conducted on two benchmark datasets demonstrate that our CRT-YOLO model significantly outperforms conventional methods for TIR image object detection. Furthermore, the ablation study provides compelling evidence of the effectiveness of our proposed modules, reinforcing the potential impact of our approach on advancing the field of thermal infrared object detection.
Related papers
- Probing Deep into Temporal Profile Makes the Infrared Small Target Detector Much Better [63.567886330598945]
Infrared small target (IRST) detection is challenging in simultaneously achieving precise, universal, robust and efficient performance. Current learning-based methods attempt to leverage "more" information from both the spatial and the short-term temporal domains. We propose an efficient deep temporal probe network (DeepPro) that only performs calculations in the time dimension for IRST detection.
arXiv Detail & Related papers (2025-06-15T08:19:32Z)
- Multispectral Detection Transformer with Infrared-Centric Feature Fusion [8.762314897895175]
Infrared-Centric Fusion (IC-Fusion) is a lightweight and modality-aware sensor fusion method. IC-Fusion prioritizes infrared features while effectively integrating complementary RGB semantic context. Experiments on the FLIR and LLVIP benchmarks demonstrate the superior effectiveness and efficiency of our IR-centric fusion strategy.
arXiv Detail & Related papers (2025-05-21T05:44:14Z)
- Multi-Domain Biometric Recognition using Body Embeddings [51.36007967653781]
We show that body embeddings perform better than face embeddings in medium-wave infrared (MWIR) and long-wave infrared (LWIR) domains. We leverage a vision transformer architecture to establish benchmark results on the IJB-MDF dataset. We also show that finetuning a body model, pretrained exclusively on VIS data, with a simple combination of cross-entropy and triplet losses achieves state-of-the-art mAP scores.
arXiv Detail & Related papers (2025-03-13T22:38:18Z)
- Bringing RGB and IR Together: Hierarchical Multi-Modal Enhancement for Robust Transmission Line Detection [67.02804741856512]
We propose a novel Hierarchical Multi-Modal Enhancement Network (HMMEN) that integrates RGB and IR data for robust and accurate TL detection. Our method introduces two key components: (1) a Mutual Multi-Modal Enhanced Block (MMEB), which fuses and enhances hierarchical RGB and IR feature maps in a coarse-to-fine manner, and (2) a Feature Alignment Block (FAB) that corrects misalignments between decoder outputs and IR feature maps by leveraging deformable convolutions.
arXiv Detail & Related papers (2025-01-25T06:21:06Z)
- Removal then Selection: A Coarse-to-Fine Fusion Perspective for RGB-Infrared Object Detection [20.12812979315803]
Object detection utilizing both visible (RGB) and thermal infrared (IR) imagery has garnered extensive attention.
Most existing multi-modal object detection methods directly input the RGB and IR images into deep neural networks.
We propose a novel coarse-to-fine perspective to purify and fuse features from both modalities.
arXiv Detail & Related papers (2024-01-19T14:49:42Z)
- Robust Environment Perception for Automated Driving: A Unified Learning Pipeline for Visual-Infrared Object Detection [2.478658210785]
We exploit both visual and thermal perception units for robust object detection purposes.
arXiv Detail & Related papers (2022-06-08T15:02:58Z)
- Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection [65.30079184700755]
This study addresses the issue of fusing infrared and visible images that appear differently for object detection.
Previous approaches discover commons underlying the two modalities and fuse upon the common space either by iterative optimization or deep networks.
This paper proposes a bilevel optimization formulation for the joint problem of fusion and detection, and then unrolls to a target-aware Dual Adversarial Learning (TarDAL) network for fusion and a commonly used detection network.
arXiv Detail & Related papers (2022-03-30T11:44:56Z)
- Meta-UDA: Unsupervised Domain Adaptive Thermal Object Detection using Meta-Learning [64.92447072894055]
Infrared (IR) cameras are robust under adverse illumination and lighting conditions.
We propose an algorithm-agnostic meta-learning framework to improve existing UDA methods.
We produce a state-of-the-art thermal detector for the KAIST and DSIAC datasets.
arXiv Detail & Related papers (2021-10-07T02:28:18Z)
- Infrared Small-Dim Target Detection with Transformer under Complex Backgrounds [155.388487263872]
We propose a new infrared small-dim target detection method with the transformer.
We adopt the self-attention mechanism of the transformer to learn the interaction information of image features in a larger range.
We also design a feature enhancement module to learn more features of small-dim targets.
arXiv Detail & Related papers (2021-09-29T12:23:41Z)
- Exploring Thermal Images for Object Detection in Underexposure Regions for Autonomous Driving [67.69430435482127]
Underexposure regions are vital to construct a complete perception of the surroundings for safe autonomous driving.
The availability of thermal cameras has provided an essential alternative for exploring regions where other optical sensors fail to capture interpretable signals.
This work proposes a domain adaptation framework which employs a style transfer technique for transfer learning from visible spectrum images to thermal images.
arXiv Detail & Related papers (2020-06-01T09:59:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.