Progressive Domain Adaptation for Thermal Infrared Object Tracking
- URL: http://arxiv.org/abs/2407.19430v2
- Date: Tue, 3 Sep 2024 09:42:24 GMT
- Title: Progressive Domain Adaptation for Thermal Infrared Object Tracking
- Authors: Qiao Li, Kanlun Tan, Qiao Liu, Di Yuan, Xin Li, Yunpeng Liu
- Abstract summary: In this work, we propose a Progressive Domain Adaptation framework for TIR Tracking.
The framework makes full use of large-scale labeled RGB datasets without requiring time-consuming and labor-intensive labeling of large-scale TIR data.
Experimental results on five TIR tracking benchmarks show that the proposed method achieves a gain of nearly 6% in success rate, demonstrating its effectiveness.
- Score: 9.888266596236578
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the lack of large-scale labeled Thermal InfraRed (TIR) training datasets, most existing TIR trackers are trained directly on RGB datasets. However, tracking methods trained on RGB datasets suffer a significant performance drop on TIR data due to the domain shift issue. To this end, in this work we propose a Progressive Domain Adaptation framework for TIR Tracking (PDAT), which transfers useful knowledge learned from RGB tracking to TIR tracking. The framework makes full use of large-scale labeled RGB datasets without requiring time-consuming and labor-intensive labeling of large-scale TIR data. Specifically, we first propose an adversarial global domain adaptation module that coarsely reduces the domain gap at the feature level. Second, we design a clustering-based subdomain adaptation method to further align the feature distributions of the RGB and TIR datasets at a finer granularity. These two domain adaptation modules gradually eliminate the discrepancy between the two domains, and thus learn domain-invariant, fine-grained features through progressive training. Additionally, we collect a large-scale TIR dataset with over 1.48 million unlabeled TIR images for training the proposed domain adaptation framework. Experimental results on five TIR tracking benchmarks show that the proposed method achieves a gain of nearly 6% in success rate, demonstrating its effectiveness.
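The abstract only names the two adaptation stages and no code accompanies this digest, so the following is a rough, non-authoritative sketch of how such a two-stage pipeline is commonly built (all class and function names are our own assumptions, not the authors'): the global stage via a gradient-reversal domain discriminator, and the subdomain stage via per-cluster feature alignment.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates (and scales) gradients in the
    backward pass, so the feature extractor learns to fool the discriminator."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class DomainDiscriminator(nn.Module):
    """Binary classifier: does a pooled feature vector come from RGB or TIR?"""
    def __init__(self, dim=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, feat, lambd=1.0):
        return self.net(GradReverse.apply(feat, lambd))

def global_adaptation_loss(disc, rgb_feat, tir_feat, lambd=1.0):
    """Coarse global alignment: adversarial loss over pooled features
    (RGB labeled 0, TIR labeled 1 -- an arbitrary convention)."""
    rgb_logits = disc(rgb_feat, lambd)
    tir_logits = disc(tir_feat, lambd)
    return (F.binary_cross_entropy_with_logits(rgb_logits, torch.zeros_like(rgb_logits))
            + F.binary_cross_entropy_with_logits(tir_logits, torch.ones_like(tir_logits)))

def subdomain_alignment_loss(rgb_feat, tir_feat, rgb_cluster, tir_cluster, n_clusters):
    """Fine alignment: pull per-cluster feature means of the two domains together.
    Cluster ids would come from, e.g., k-means run periodically on pooled
    features; here they are given as 1-D index tensors."""
    loss = rgb_feat.new_zeros(())
    for c in range(n_clusters):
        r, t = rgb_feat[rgb_cluster == c], tir_feat[tir_cluster == c]
        if len(r) and len(t):
            loss = loss + F.mse_loss(r.mean(0), t.mean(0))
    return loss / n_clusters
```

In a progressive schedule one would typically weight the global loss early in training and phase in the subdomain loss once clusters stabilize; the abstract does not specify the exact losses or schedule PDAT uses.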
Related papers
- Cross-domain Learning Framework for Tracking Users in RIS-aided Multi-band ISAC Systems with Sparse Labeled Data [55.70071704247794]
Integrated sensing and communications (ISAC) is pivotal for 6G communications and is boosted by the rapid development of reconfigurable intelligent surfaces (RISs).
This paper proposes the X2Track framework, where we model the tracking function by a hierarchical architecture, jointly utilizing multi-modal CSI indicators across multiple bands, and optimize it in a cross-domain manner.
Under X2Track, we design an efficient deep learning algorithm to minimize tracking errors, based on transformer neural networks and adversarial learning techniques.
arXiv Detail & Related papers (2024-05-10T08:04:27Z) - Thermal-Infrared Remote Target Detection System for Maritime Rescue
based on Data Augmentation with 3D Synthetic Data [4.66313002591741]
This paper proposes a thermal-infrared (TIR) remote target detection system for maritime rescue using deep learning and data augmentation.
To address dataset scarcity and improve model robustness, a synthetic dataset is collected from a 3D game (ARMA3) to augment the data.
The proposed segmentation model surpasses the performance of state-of-the-art segmentation methods.
arXiv Detail & Related papers (2023-10-31T12:37:49Z) - Visible-Infrared Person Re-Identification Using Privileged Intermediate
Information [10.816003787786766]
Cross-modal person re-identification (ReID) is challenging due to the large domain shift in data distributions between RGB and IR modalities.
This paper introduces a novel approach for creating an intermediate virtual domain that acts as a bridge between the two main domains.
We devised a new method to generate images between visible and infrared domains that provide additional information to train a deep ReID model.
arXiv Detail & Related papers (2022-09-19T21:08:14Z) - Dual Swin-Transformer based Mutual Interactive Network for RGB-D Salient
Object Detection [67.33924278729903]
In this work, we propose a Dual Swin-Transformer based Mutual Interactive Network (DTMINet).
We adopt Swin-Transformer as the feature extractor for both RGB and depth modality to model the long-range dependencies in visual inputs.
Comprehensive experiments on five standard RGB-D SOD benchmark datasets demonstrate the superiority of the proposed DTMINet method.
arXiv Detail & Related papers (2022-06-07T08:35:41Z) - Temporal Aggregation for Adaptive RGBT Tracking [14.00078027541162]
We propose an RGBT tracker that takes temporal clues into account for robust appearance model learning.
Unlike most existing RGBT trackers that implement object tracking tasks with only spatial information included, temporal information is further considered in this method.
arXiv Detail & Related papers (2022-01-22T02:31:56Z) - Unsupervised Cross-Modal Distillation for Thermal Infrared Tracking [39.505507508776404]
The target representation learned by convolutional neural networks plays an important role in Thermal Infrared (TIR) tracking.
We propose to distill representations of the TIR modality from the RGB modality with Cross-Modal Distillation (CMD).
Our tracker outperforms the baseline tracker, achieving absolute gains of 2.3% in Success, 2.7% in Precision, and 2.5% in Normalized Precision; a minimal sketch of the distillation idea follows.
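The entry gives no implementation detail; as a hedged sketch of cross-modal feature distillation (the function name and the MSE matching loss are our assumptions), a frozen RGB teacher can supervise a TIR student on spatially aligned RGB/TIR pairs:

```python
import torch
import torch.nn.functional as F

def cmd_step(teacher, student, rgb_img, tir_img):
    """One distillation step: a frozen RGB teacher supervises the TIR
    student by feature matching on aligned RGB/TIR image pairs."""
    with torch.no_grad():
        t_feat = teacher(rgb_img)       # teacher stays frozen
    s_feat = student(tir_img)
    return F.mse_loss(s_feat, t_feat)   # minimized w.r.t. the student only
```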
arXiv Detail & Related papers (2021-07-31T09:19:59Z) - Self-Supervised Representation Learning for RGB-D Salient Object
Detection [93.17479956795862]
We use Self-Supervised Representation Learning to design two pretext tasks: the cross-modal auto-encoder and the depth-contour estimation.
Our pretext tasks require only a few unlabeled RGB-D datasets to perform pre-training, which makes the network capture rich semantic contexts.
For the inherent problem of cross-modal fusion in RGB-D SOD, we propose a multi-path fusion module.
arXiv Detail & Related papers (2021-01-29T09:16:06Z) - Collaborative Training between Region Proposal Localization and
Classification for Domain Adaptive Object Detection [121.28769542994664]
Domain adaptation for object detection tries to adapt the detector from labeled datasets to unlabeled ones for better performance.
In this paper, we are the first to reveal that the region proposal network (RPN) and the region proposal classifier (RPC) demonstrate significantly different transferability when facing a large domain gap.
arXiv Detail & Related papers (2020-09-17T07:39:52Z) - Bi-directional Cross-Modality Feature Propagation with
Separation-and-Aggregation Gate for RGB-D Semantic Segmentation [59.94819184452694]
Depth information has proven to be a useful cue in the semantic segmentation of RGBD images for providing a geometric counterpart to the RGB representation.
Most existing works simply assume that depth measurements are accurate and well-aligned with the RGB pixels and model the problem as cross-modal feature fusion.
In this paper, we propose a unified and efficient cross-modality guided encoder that not only effectively recalibrates RGB feature responses, but also distills accurate depth information via multiple stages, aggregating the two recalibrated representations alternately.
arXiv Detail & Related papers (2020-07-17T18:35:24Z) - Synergistic saliency and depth prediction for RGB-D saliency detection [76.27406945671379]
Existing RGB-D saliency datasets are small, which may lead to overfitting and limited generalization for diverse scenarios.
We propose a semi-supervised system for RGB-D saliency detection that can be trained on smaller RGB-D saliency datasets without saliency ground truth.
arXiv Detail & Related papers (2020-07-03T14:24:41Z) - Unsupervised Domain Adaptation through Inter-modal Rotation for RGB-D
Object Recognition [31.24587317555857]
We propose a novel RGB-D DA method that reduces the synthetic-to-real domain shift by exploiting the inter-modal relation between the RGB and depth image.
Our method consists of training a convolutional neural network to solve, in addition to the main recognition task, the pretext task of predicting the relative rotation between the RGB and depth image; a small sketch of this pretext task follows the entry.
arXiv Detail & Related papers (2020-04-21T13:53:55Z)