M$^5$L: Multi-Modal Multi-Margin Metric Learning for RGBT Tracking
- URL: http://arxiv.org/abs/2003.07650v1
- Date: Tue, 17 Mar 2020 11:37:56 GMT
- Title: M$^5$L: Multi-Modal Multi-Margin Metric Learning for RGBT Tracking
- Authors: Zhengzheng Tu, Chun Lin, Chenglong Li, Jin Tang and Bin Luo
- Abstract summary: Classifying the confusing samples in the course of RGBT tracking is a challenging problem.
We propose a novel Multi-Modal Multi-Margin Metric Learning framework, named M$^5$L, for RGBT tracking.
Our framework clearly improves the tracking performance and outperforms the state-of-the-art RGBT trackers.
- Score: 44.296318907168
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Classifying confusing samples in the course of RGBT tracking is a
quite challenging problem that has not yet been satisfactorily solved. Existing
methods focus only on enlarging the boundary between positive and negative
samples; however, the structured information of the samples might be harmed,
e.g., confusing positive samples may lie closer to the anchor than normal
positive samples. To handle this problem, we propose a novel Multi-Modal
Multi-Margin Metric Learning framework, named M$^5$L, for RGBT tracking in this
paper. In particular, we design a multi-margin structured loss to distinguish
the confusing samples, which play the most critical role in boosting tracking
performance. Specifically, we enlarge the boundaries between confusing positive
samples and normal ones, and between confusing negative samples and normal
ones, with predefined margins, by exploiting the structured information of all
samples in each modality. Moreover, a cross-modality constraint is employed to
reduce the difference between modalities and to push positive samples from both
modalities closer to the anchor than negative ones. In addition, to achieve
quality-aware RGB and thermal feature fusion, we introduce modality attentions
and learn them with a feature fusion module in our network. Extensive
experiments on large-scale datasets demonstrate that our framework clearly
improves tracking performance and outperforms state-of-the-art RGBT trackers.
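The multi-margin idea described in the abstract can be sketched as a sum of hinge terms over anchor-to-sample distances in one modality. The function below is a minimal illustrative reading, not the authors' implementation: the margin names and values (`m_pn`, `m_pp`, `m_nn`) and the exact ordering constraints are assumptions inferred from the abstract's description.

```python
# Illustrative sketch of a multi-margin structured loss in the spirit of
# M^5L, for a single modality. All names and margin values are assumptions.

def hinge(x):
    """max(0, x): zero when the corresponding margin constraint holds."""
    return max(0.0, x)

def multi_margin_loss(d_pos, d_pos_conf, d_neg, d_neg_conf,
                      m_pn=1.0, m_pp=0.3, m_nn=0.3):
    """Sum of hinge terms over anchor-to-sample distances.

    d_pos / d_pos_conf : distance to a normal / confusing positive sample
    d_neg / d_neg_conf : distance to a normal / confusing negative sample
    m_pn : margin between positives and negatives (classic triplet term)
    m_pp : margin between normal and confusing positives
    m_nn : margin between confusing and normal negatives
    """
    return (
        hinge(d_pos - d_neg + m_pn)              # positives closer than negatives
        + hinge(d_pos_conf - d_neg_conf + m_pn)  # same ordering for confusing samples
        + hinge(d_pos - d_pos_conf + m_pp)       # normal positives closer than confusing ones
        + hinge(d_neg_conf - d_neg + m_nn)       # confusing negatives closer than normal ones
    )
```

For a well-structured neighborhood, e.g. `multi_margin_loss(0.2, 0.6, 2.5, 1.8)`, every hinge term is zero; when all four distances collapse to the same value, each margin is violated and the loss is positive. In the full framework such a per-modality term would be computed for both RGB and thermal features and combined with the cross-modality constraint.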
Related papers
- Task-oriented Embedding Counts: Heuristic Clustering-driven Feature Fine-tuning for Whole Slide Image Classification [1.292108130501585]
We propose a clustering-driven feature fine-tuning method (HC-FT) to enhance the performance of multiple instance learning.
The proposed method is evaluated on both CAMELYON16 and BRACS datasets, achieving an AUC of 97.13% and 85.85%, respectively.
arXiv Detail & Related papers (2024-06-02T08:53:45Z)
- Propensity Score Alignment of Unpaired Multimodal Data [3.8373578956681555]
Multimodal representation learning techniques typically rely on paired samples to learn common representations.
This paper presents an approach to address the challenge of aligning unpaired samples across disparate modalities in multimodal representation learning.
arXiv Detail & Related papers (2024-04-02T02:36:21Z)
- Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
arXiv Detail & Related papers (2023-08-28T18:48:34Z)
- Cluster-guided Contrastive Graph Clustering Network [53.16233290797777]
We propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC).
We construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks.
To construct semantic meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples.
arXiv Detail & Related papers (2023-01-03T13:42:38Z)
- ScatterSample: Diversified Label Sampling for Data Efficient Graph Neural Network Learning [22.278779277115234]
In some graph neural network (GNN) applications, labeling new training instances is expensive.
We develop a data-efficient active sampling framework, ScatterSample, to train GNNs under an active learning setting.
Our experiments on five datasets show that ScatterSample significantly outperforms the other GNN active learning baselines.
arXiv Detail & Related papers (2022-06-09T04:05:02Z)
- Doubly Contrastive Deep Clustering [135.7001508427597]
We present a novel Doubly Contrastive Deep Clustering (DCDC) framework, which constructs contrastive loss over both sample and class views.
Specifically, for the sample view, we set the class distribution of the original sample and its augmented version as positive sample pairs.
For the class view, we build the positive and negative pairs from the sample distribution of the class.
In this way, two contrastive losses successfully constrain the clustering results of mini-batch samples in both sample and class level.
arXiv Detail & Related papers (2021-03-09T15:15:32Z)
- Robust Visual Tracking via Statistical Positive Sample Generation and Gradient Aware Learning [28.60114425270413]
CNN based trackers have achieved state-of-the-art performance on multiple benchmark datasets.
We propose a robust tracking method via Statistical Positive sample generation and Gradient Aware learning (SPGA).
We show that the proposed SPGA performs favorably against several state-of-the-art trackers.
arXiv Detail & Related papers (2020-11-09T09:14:58Z)
- Hard Negative Samples Emphasis Tracker without Anchors [10.616828072065093]
We address the problem that distinguishes the tracking target from hard negative samples in the tracking phase.
We propose a simple yet efficient hard negative samples emphasis method, which constrains Siamese network to learn features that are aware of hard negative samples.
We also explore a novel anchor-free tracking framework in a per-pixel prediction fashion.
arXiv Detail & Related papers (2020-08-08T12:38:38Z)
- Multi-Scale Positive Sample Refinement for Few-Shot Object Detection [61.60255654558682]
Few-shot object detection (FSOD) helps detectors adapt to unseen classes with few training instances.
We propose a Multi-scale Positive Sample Refinement (MPSR) approach to enrich object scales in FSOD.
MPSR generates multi-scale positive samples as object pyramids and refines the prediction at various scales.
arXiv Detail & Related papers (2020-07-18T09:48:29Z)
- When Relation Networks meet GANs: Relation GANs with Triplet Loss [110.7572918636599]
Training stability is still a lingering concern of generative adversarial networks (GANs).
In this paper, we explore a relation network architecture for the discriminator and design a triplet loss which performs better generalization and stability.
Experiments on benchmark datasets show that the proposed relation discriminator and new loss can provide significant improvement on various vision tasks.
arXiv Detail & Related papers (2020-02-24T11:35:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.