DCFG: Diverse Cross-Channel Fine-Grained Feature Learning and Progressive Fusion Siamese Tracker for Thermal Infrared Target Tracking
- URL: http://arxiv.org/abs/2504.14311v1
- Date: Sat, 19 Apr 2025 14:24:37 GMT
- Authors: Ruoyan Xiong, Yuke Hou, Princess Retor Torboh, Hui He, Huanbin Zhang, Yue Zhang, Yanpin Wang, Huipan Guan, Shang Zhang
- Abstract summary: Proposes a cross-channel fine-grained feature learning network that suppresses dominant target features, a channel rearrangement mechanism that improves information flow, and a specialized cross-channel fine-grained loss function that guides feature groups toward distinct discriminative regions of the target.
- Score: 11.3097285242147
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To address the challenge of capturing highly discriminative features in thermal infrared (TIR) tracking, we propose a novel Siamese tracker based on cross-channel fine-grained feature learning and progressive fusion. First, we introduce a cross-channel fine-grained feature learning network that employs masks and suppression coefficients to suppress dominant target features, enabling the tracker to capture more detailed and subtle information. The network employs a channel rearrangement mechanism to enhance efficient information flow, coupled with channel equalization to reduce parameter count. Additionally, we incorporate layer-by-layer combination units for effective feature extraction and fusion, thereby minimizing parameter redundancy and computational complexity. The network further employs feature redirection and channel shuffling strategies to better integrate fine-grained details. Second, we propose a specialized cross-channel fine-grained loss function designed to guide feature groups toward distinct discriminative regions of the target, thus improving overall target representation. This loss function includes an inter-channel loss term that promotes orthogonality between channels, maximizing feature diversity and facilitating finer detail capture. Extensive experiments demonstrate that our proposed tracker achieves the highest accuracy, scoring 0.81 on the VOT-TIR 2015 and 0.78 on the VOT-TIR 2017 benchmark, while also outperforming other methods across all evaluation metrics on the LSOTB-TIR and PTB-TIR benchmarks.
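The abstract mentions channel shuffling and an inter-channel loss term that promotes orthogonality between channels, but gives no formulas. The sketch below is only an illustration of what such components commonly look like, not the authors' implementation: `channel_shuffle` follows the usual ShuffleNet-style reshape-and-transpose ordering, and `orthogonality_loss` is a hypothetical penalty defined here as the mean squared cosine similarity between distinct channels. Both function names, the `groups` and `eps` parameters, and the exact loss form are assumptions.

```python
import math


def channel_shuffle(channels, groups):
    """Reorder channels as if reshaped to (groups, C // groups), transposed, and flattened.

    This is the common ShuffleNet-style shuffle; whether DCFG uses this exact
    ordering is an assumption.
    """
    c = len(channels)
    if c % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = c // groups
    return [channels[g * per_group + j] for j in range(per_group) for g in range(groups)]


def orthogonality_loss(features, eps=1e-12):
    """Mean squared cosine similarity over all distinct channel pairs.

    features: list of C flattened channel maps (equal-length lists of floats).
    The value is 0 when all channels are mutually orthogonal and approaches 1
    when they are collinear. Hypothetical loss form, not taken from the paper.
    """
    c = len(features)
    if c < 2:
        return 0.0
    # L2-normalize each channel so the dot product becomes a cosine similarity.
    unit = []
    for f in features:
        norm = math.sqrt(sum(x * x for x in f)) + eps
        unit.append([x / norm for x in f])
    total = 0.0
    for i in range(c):
        for j in range(i + 1, c):
            cos = sum(a * b for a, b in zip(unit[i], unit[j]))
            total += cos * cos
    return total / (c * (c - 1) / 2)
```

In a training setup, a term like this would typically be added to the tracking loss with a small weight, so that feature groups are pushed toward different regions of the target without overriding the main objective.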
Related papers
- FGSGT: Saliency-Guided Siamese Network Tracker Based on Key Fine-Grained Feature Information for Thermal Infrared Target Tracking [11.599952876425736]
We propose a novel saliency-guided Siamese network tracker based on key fine-grained feature information.
This design captures essential global features from shallow layers, enhances feature diversity, and minimizes the loss of fine-grained information.
Experiment results demonstrate that the proposed tracker achieves the highest precision and success rates.
arXiv Detail & Related papers (2025-04-19T14:13:15Z)
- MSCA-Net: Multi-Scale Context Aggregation Network for Infrared Small Target Detection [0.0]
This paper proposes a novel network architecture named MSCA-Net, which integrates three key components.
MSEDA employs a multi-scale feature fusion attention mechanism to adaptively aggregate information across different scales.
PCBAM captures the correlation between global and local features through a correlation matrix-based strategy.
arXiv Detail & Related papers (2025-03-21T14:42:31Z)
- Distilling Channels for Efficient Deep Tracking [68.13422829310835]
This paper presents a novel framework termed channel distillation to facilitate deep trackers.
We show that an integrated formulation can turn feature compression, response map generation, and model update into a unified energy minimization problem.
The resulting deep tracker is accurate, fast, and has low memory requirements.
arXiv Detail & Related papers (2024-09-18T08:09:20Z)
- SCTransNet: Spatial-channel Cross Transformer Network for Infrared Small Target Detection [46.049401912285134]
Infrared small target detection (IRSTD) has recently benefitted greatly from U-shaped neural models.
Existing techniques struggle when the target has high similarities with the background.
We present a Spatial-channel Cross Transformer Network (SCTransNet) that leverages spatial-channel cross transformer blocks.
arXiv Detail & Related papers (2024-01-28T06:41:15Z)
- Improved Dense Nested Attention Network Based on Transformer for Infrared Small Target Detection [8.388564430699155]
Infrared small target detection based on deep learning offers unique advantages in separating small targets from complex and dynamic backgrounds.
The features of infrared small targets gradually weaken as the depth of convolutional neural network (CNN) increases.
We propose an improved dense nested attention network (IDNANet), which is based on the transformer architecture.
arXiv Detail & Related papers (2023-11-15T07:29:24Z)
- Joint Channel Estimation and Feedback with Masked Token Transformers in Massive MIMO Systems [74.52117784544758]
This paper proposes an encoder-decoder based network that unveils the intrinsic frequency-domain correlation within the CSI matrix.
The entire encoder-decoder network is utilized for channel compression.
Our method outperforms state-of-the-art channel estimation and feedback techniques in joint tasks.
arXiv Detail & Related papers (2023-06-08T06:15:17Z)
- CATRO: Channel Pruning via Class-Aware Trace Ratio Optimization [61.71504948770445]
We propose a novel channel pruning method via Class-Aware Trace Ratio Optimization (CATRO) to reduce the computational burden and accelerate the model inference.
We show that CATRO achieves higher accuracy with similar cost or lower cost with similar accuracy than other state-of-the-art channel pruning algorithms.
Because of its class-aware property, CATRO is suitable for adaptively pruning efficient networks for various classification subtasks, easing the deployment and use of deep networks in real-world applications.
arXiv Detail & Related papers (2021-10-21T06:26:31Z)
- Group Fisher Pruning for Practical Network Compression [58.25776612812883]
We present a general channel pruning approach that can be applied to various complicated structures.
We derive a unified metric based on Fisher information to evaluate the importance of a single channel and coupled channels.
Our method can be used to prune any structures including those with coupled channels.
arXiv Detail & Related papers (2021-08-02T08:21:44Z)
- CE-FPN: Enhancing Channel Information for Object Detection [12.954675966833372]
Feature pyramid network (FPN) has been an effective framework to extract multi-scale features in object detection.
We present a novel channel enhancement network (CE-FPN) with three simple yet effective modules to alleviate these problems.
Our experiments show that CE-FPN achieves competitive performance compared to state-of-the-art FPN-based detectors on the MS COCO benchmark.
arXiv Detail & Related papers (2021-03-19T05:51:53Z)
- Channel-wise Knowledge Distillation for Dense Prediction [73.99057249472735]
We propose to align features channel-wise between the student and teacher networks.
We consistently achieve superior performance on three benchmarks with various network structures.
arXiv Detail & Related papers (2020-11-26T12:00:38Z)
- Operation-Aware Soft Channel Pruning using Differentiable Masks [51.04085547997066]
We propose a data-driven algorithm, which compresses deep neural networks in a differentiable way by exploiting the characteristics of operations.
We perform extensive experiments and achieve outstanding performance in terms of the accuracy of output networks.
arXiv Detail & Related papers (2020-07-08T07:44:00Z)