Challenge-Aware RGBT Tracking
- URL: http://arxiv.org/abs/2007.13143v1
- Date: Sun, 26 Jul 2020 15:11:44 GMT
- Title: Challenge-Aware RGBT Tracking
- Authors: Chenglong Li, Lei Liu, Andong Lu, Qing Ji, and Jin Tang
- Abstract summary: We propose a novel challenge-aware neural network to handle the modality-shared challenges and the modality-specific ones.
We show that our method operates at real-time speed while performing well against state-of-the-art methods on three benchmark datasets.
- Score: 32.88141817679821
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: RGB and thermal source data suffer from both shared and specific challenges,
and how to explore and exploit them plays a critical role to represent the
target appearance in RGBT tracking. In this paper, we propose a novel
challenge-aware neural network to handle the modality-shared challenges (e.g.,
fast motion, scale variation and occlusion) and the modality-specific ones
(e.g., illumination variation and thermal crossover) for RGBT tracking. In
particular, we design several parameter-shared branches in each layer to model
the target appearance under the modality-shared challenges, and several
parameter-independent branches under the modality-specific ones. Based on the
observation that the modality-specific cues of different modalities usually
contain complementary advantages, we propose a guidance module that transfers
discriminative features from one modality to the other, enhancing the
discriminative ability of the weaker modality. Moreover, all branches are
aggregated in an adaptive manner and embedded in parallel in the backbone
network to efficiently form more discriminative target representations. These
challenge-aware branches are able to model the target appearance under certain
challenges, so the target representations can be learned with few parameters
even when training data are insufficient. Experimental results show that our
method operates at real-time speed while performing favorably against
state-of-the-art methods on three benchmark datasets.
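To make the described architecture concrete, below is a minimal PyTorch sketch of one challenge-aware layer, written from the abstract alone rather than from the authors' released code. The class names (ChallengeAwareLayer, GuidanceModule), the branch counts, and the 1x1-conv sigmoid gate inside the guidance module are illustrative assumptions; only the overall structure (parameter-shared vs. parameter-independent branches, cross-modal guidance, adaptive aggregation, and parallel residual embedding in the backbone) follows the paper's description.

```python
import torch
import torch.nn as nn


class GuidanceModule(nn.Module):
    """Transfer discriminative cues from one modality to the other.

    The 1x1-conv sigmoid gate is a hypothetical realization; the paper only
    states that discriminative features are transferred across modalities.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())

    def forward(self, target: torch.Tensor, guide: torch.Tensor) -> torch.Tensor:
        # Add gated features of the guiding modality to the target modality.
        return target + self.gate(guide) * guide


class ChallengeAwareLayer(nn.Module):
    """One backbone layer with parallel challenge-aware branches."""

    def __init__(self, channels: int, n_shared: int = 3, n_specific: int = 2):
        super().__init__()
        # Parameter-shared branches: the SAME weights process both modalities
        # (modality-shared challenges: fast motion, scale variation, occlusion).
        self.shared = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=1) for _ in range(n_shared)
        )
        # Parameter-independent branches: one set per modality
        # (modality-specific challenges: illumination variation, thermal crossover).
        self.rgb_only = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=1) for _ in range(n_specific)
        )
        self.thermal_only = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=1) for _ in range(n_specific)
        )
        self.guide_to_rgb = GuidanceModule(channels)
        self.guide_to_thermal = GuidanceModule(channels)
        # Adaptive aggregation: learned softmax weights over all branches.
        self.w_rgb = nn.Parameter(torch.zeros(n_shared + n_specific))
        self.w_thermal = nn.Parameter(torch.zeros(n_shared + n_specific))

    @staticmethod
    def _aggregate(feats, weights):
        w = torch.softmax(weights, dim=0)
        return sum(wi * f for wi, f in zip(w, feats))

    def forward(self, rgb: torch.Tensor, thermal: torch.Tensor):
        rgb_feats = [b(rgb) for b in self.shared] + [b(rgb) for b in self.rgb_only]
        t_feats = [b(thermal) for b in self.shared] + [b(thermal) for b in self.thermal_only]
        rgb_agg = self._aggregate(rgb_feats, self.w_rgb)
        t_agg = self._aggregate(t_feats, self.w_thermal)
        # Cross-modal guidance: each modality borrows cues from the other.
        rgb_guided = self.guide_to_rgb(rgb_agg, t_agg)
        t_guided = self.guide_to_thermal(t_agg, rgb_agg)
        # Parallel residual embedding into the backbone stream.
        return rgb + rgb_guided, thermal + t_guided


if __name__ == "__main__":
    layer = ChallengeAwareLayer(channels=64)
    rgb = torch.randn(1, 64, 56, 56)
    thermal = torch.randn(1, 64, 56, 56)
    out_rgb, out_thermal = layer(rgb, thermal)
    print(out_rgb.shape, out_thermal.shape)  # both: torch.Size([1, 64, 56, 56])
```

In this sketch, the softmax-weighted aggregation adds only one scalar weight per branch per layer, consistent with the paper's claim that target representations can be learned with few parameters even under insufficient training data.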
Related papers
- Towards a Generalist and Blind RGB-X Tracker [91.36268768952755]
We develop a single model tracker that can remain blind to any modality X during inference time.
Our training process is extremely simple, integrating multi-label classification loss with a routing function.
Our generalist and blind tracker can achieve competitive performance compared to well-established modal-specific models.
arXiv Detail & Related papers (2024-05-28T03:00:58Z)
- Modality Prompts for Arbitrary Modality Salient Object Detection [57.610000247519196]
This paper delves into the task of arbitrary modality salient object detection (AM SOD).
It aims to detect salient objects from arbitrary modalities, e.g., RGB images, RGB-D images, and RGB-D-T images.
A novel modality-adaptive Transformer (MAT) is proposed to address two fundamental challenges of AM SOD.
arXiv Detail & Related papers (2024-05-06T11:02:02Z)
- Cross-Modality Perturbation Synergy Attack for Person Re-identification [66.48494594909123]
The main challenge in cross-modality ReID lies in effectively dealing with visual differences between different modalities.
Existing attack methods have primarily focused on the characteristics of the visible image modality.
This study proposes a universal perturbation attack specifically designed for cross-modality ReID.
arXiv Detail & Related papers (2024-01-18T15:56:23Z)
- Modality-missing RGBT Tracking: Invertible Prompt Learning and High-quality Benchmarks [21.139161163767884]
Modality information may be missing due to factors such as thermal sensor self-calibration and data transmission errors.
We propose a novel invertible prompt learning approach that integrates content-preserving prompts into a well-trained tracking model.
Our method achieves significant performance improvements compared with state-of-the-art methods.
arXiv Detail & Related papers (2023-12-25T11:39:00Z)
- Leveraging Diffusion Disentangled Representations to Mitigate Shortcuts in Underspecified Visual Tasks [92.32670915472099]
We propose an ensemble diversification framework exploiting the generation of synthetic counterfactuals using Diffusion Probabilistic Models (DPMs).
We show that diffusion-guided diversification can lead models to avert attention from shortcut cues, achieving ensemble diversity performance comparable to previous methods requiring additional data collection.
arXiv Detail & Related papers (2023-10-03T17:37:52Z)
- One-stage Modality Distillation for Incomplete Multimodal Learning [7.791488931628906]
This paper presents a one-stage modality distillation framework that unifies the privileged knowledge transfer and modality information fusion.
The proposed framework can overcome the problem of incomplete modality input in various scenes and achieve state-of-the-art performance.
arXiv Detail & Related papers (2023-09-15T07:12:27Z)
- CMTR: Cross-modality Transformer for Visible-infrared Person Re-identification [38.96033760300123]
We propose a cross-modality transformer-based method (CMTR) for the visible-infrared person re-identification task.
We design novel modality embeddings, which are fused with token embeddings to encode modality information.
The proposed CMTR model significantly outperforms existing CNN-based methods.
arXiv Detail & Related papers (2021-10-18T03:12:59Z)
- RGBT Tracking via Multi-Adapter Network with Hierarchical Divergence Loss [37.99375824040946]
We propose a novel multi-adapter network to jointly perform modality-shared, modality-specific and instance-aware target representation learning.
Experiments on two RGBT tracking benchmark datasets demonstrate the outstanding performance of the proposed tracker.
arXiv Detail & Related papers (2020-11-14T01:50:46Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
- When Relation Networks meet GANs: Relation GANs with Triplet Loss [110.7572918636599]
Training stability is still a lingering concern of generative adversarial networks (GANs).
In this paper, we explore a relation network architecture for the discriminator and design a triplet loss that yields better generalization and stability.
Experiments on benchmark datasets show that the proposed relation discriminator and new loss provide significant improvements on various vision tasks.
arXiv Detail & Related papers (2020-02-24T11:35:28Z)