Explicitly Modeling the Discriminability for Instance-Aware Visual Object Tracking
- URL: http://arxiv.org/abs/2110.15030v1
- Date: Thu, 28 Oct 2021 11:24:01 GMT
- Title: Explicitly Modeling the Discriminability for Instance-Aware Visual Object Tracking
- Authors: Mengmeng Wang, Xiaoqian Yang, and Yong Liu
- Abstract summary: We propose a novel Instance-Aware Tracker (IAT) to excavate the discriminability of feature representations.
We implement two variants of the proposed IAT, including a video-level one and an object-level one.
Both versions achieve leading results against state-of-the-art methods while running at 30 FPS.
- Score: 13.311777431243296
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual object tracking performance has improved dramatically in recent years, but some severe challenges, such as distractors and occlusions, remain open. We suspect the reason is that the feature representations of the tracking targets are learned only for expressiveness and are not fully discriminatively modeled. In this paper, we propose a novel Instance-Aware Tracker (IAT) to explicitly excavate the discriminability of feature representations, improving the classical visual tracking pipeline with an instance-level classifier. First, we introduce a contrastive learning mechanism to formulate the classification task, ensuring that every training sample can be uniquely modeled and remains highly distinguishable from a large number of other samples. In addition, we design an effective negative sample selection scheme that covers diverse intra-class and inter-class samples in the instance classification branch. Furthermore, we implement two variants of the proposed IAT: a video-level one and an object-level one. They realize the concept of instance at different granularities, namely videos and target bounding boxes, respectively. The former enhances the ability to distinguish the target from the background, while the latter boosts the discriminative power needed to mitigate the target-distractor dilemma. Extensive experimental evaluations on 8 benchmark datasets show that both versions of the proposed IAT achieve leading results against state-of-the-art methods while running at 30 FPS. Code will be made available upon publication.
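As a rough illustration of the instance-level contrastive classification idea described in the abstract, the sketch below implements a minimal InfoNCE-style loss in PyTorch: a query embedding is pulled toward an embedding of its own instance and pushed away from a pool of negative instances. All names, shapes, and the temperature value are illustrative assumptions, not the authors' code (which is not yet public).

```python
# Minimal sketch of an instance-level contrastive loss (InfoNCE-style):
# pull a query toward its own instance, push it away from many negatives.
# Names and shapes are illustrative assumptions, not the authors' code.
import torch
import torch.nn.functional as F

def instance_contrastive_loss(query, positive, negatives, temperature=0.07):
    """query:     (D,)   embedding of the current sample
    positive:  (D,)   embedding of the same instance (e.g. same video/box)
    negatives: (N, D) embeddings of other instances (intra- and inter-class)
    """
    q = F.normalize(query, dim=-1)
    pos = F.normalize(positive, dim=-1)
    negs = F.normalize(negatives, dim=-1)

    # Cosine similarities scaled by temperature.
    l_pos = (q @ pos) / temperature                # scalar
    l_neg = (negs @ q) / temperature               # (N,)

    logits = torch.cat([l_pos.view(1), l_neg])     # positive is class 0
    target = torch.zeros(1, dtype=torch.long)
    return F.cross_entropy(logits.unsqueeze(0), target)

# Toy usage with random features.
d, n_neg = 128, 64
loss = instance_contrastive_loss(torch.randn(d), torch.randn(d), torch.randn(n_neg, d))
print(loss.item())
```

The negative pool would, per the abstract, be filled by the proposed selection scheme with samples spanning diverse intra-class and inter-class instances; here it is simply random.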
Related papers
- Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation [58.37525311718006]
We put forth a novel formulation of the aerial object detection problem, namely open-vocabulary aerial object detection (OVAD).
We propose CastDet, a CLIP-activated student-teacher detection framework that serves as the first OVAD detector specifically designed for the challenging aerial scenario.
Our framework integrates a robust localization teacher along with several box selection strategies to generate high-quality proposals for novel objects.
arXiv Detail & Related papers (2024-11-04T12:59:13Z)
- Negative Prototypes Guided Contrastive Learning for WSOD [8.102080369924911]
Weakly Supervised Object Detection (WSOD) with only image-level annotation has recently attracted wide attention.
We propose the Negative Prototypes Guided Contrastive learning architecture.
Our proposed method achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-06-04T08:16:26Z)
- Exploring Robust Features for Few-Shot Object Detection in Satellite Imagery [17.156864650143678]
We develop a few-shot object detector based on a traditional two-stage architecture.
A large-scale pre-trained model is used to build class-reference embeddings or prototypes.
We perform evaluations on two remote sensing datasets containing challenging and rare objects.
arXiv Detail & Related papers (2024-03-08T15:20:27Z)
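The few-shot detector above builds class-reference embeddings, or prototypes, from a large-scale pre-trained model. A minimal sketch of that general idea (all names, shapes, and the cosine-matching step are assumptions, not the paper's code): prototypes are mean support embeddings, and proposals are scored by similarity to them.

```python
# Hypothetical sketch of prototype-based matching for few-shot detection:
# class prototypes are mean embeddings of the few support examples, and
# region proposals are classified by cosine similarity to each prototype.
import torch
import torch.nn.functional as F

def build_prototypes(support_feats):
    """support_feats: dict class_name -> (K, D) embeddings from a
    pre-trained backbone (K support shots per class)."""
    return {c: F.normalize(f.mean(dim=0), dim=-1) for c, f in support_feats.items()}

def classify_proposals(proposal_feats, prototypes):
    """proposal_feats: (P, D). Returns the best class and score per proposal."""
    names = list(prototypes)
    protos = torch.stack([prototypes[c] for c in names])     # (C, D)
    sims = F.normalize(proposal_feats, dim=-1) @ protos.T    # (P, C)
    scores, idx = sims.max(dim=1)
    return [(names[i], s.item()) for i, s in zip(idx.tolist(), scores)]

# Toy usage with random embeddings.
supports = {"airplane": torch.randn(5, 256), "ship": torch.randn(5, 256)}
print(classify_proposals(torch.randn(3, 256), build_prototypes(supports)))
```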
- OVTrack: Open-Vocabulary Multiple Object Tracking [64.73379741435255]
OVTrack is an open-vocabulary tracker capable of tracking arbitrary object classes.
It sets a new state-of-the-art on the large-scale, large-vocabulary TAO benchmark.
arXiv Detail & Related papers (2023-04-17T16:20:05Z)
- Dual Prototypical Contrastive Learning for Few-shot Semantic Segmentation [55.339405417090084]
We propose a dual prototypical contrastive learning approach tailored to the few-shot semantic segmentation (FSS) task.
The main idea is to make the prototypes more discriminative by increasing inter-class distance while reducing intra-class distance in the prototype feature space.
We demonstrate that the proposed dual contrastive learning approach outperforms state-of-the-art FSS methods on PASCAL-5i and COCO-20i datasets.
arXiv Detail & Related papers (2021-11-09T08:14:50Z)
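A minimal sketch of the prototype-contrastive objective described above (assumed names and shapes, not the paper's code): each feature is pulled toward its own class prototype and pushed away from all others, which jointly reduces intra-class distance and increases inter-class distance.

```python
# Illustrative sketch of a prototype-level contrastive objective: pull
# each feature toward its own class prototype (small intra-class
# distance) and away from other prototypes (large inter-class distance).
import torch
import torch.nn.functional as F

def prototype_contrastive_loss(feats, labels, prototypes, temperature=0.1):
    """feats: (B, D) region features, labels: (B,) class ids,
    prototypes: (C, D) one prototype per class."""
    f = F.normalize(feats, dim=-1)
    p = F.normalize(prototypes, dim=-1)
    logits = f @ p.T / temperature   # (B, C) similarity to every prototype
    # Cross-entropy against the true class maximizes the own-prototype
    # similarity while suppressing similarity to all other prototypes.
    return F.cross_entropy(logits, labels)

# Toy usage with random features and five classes.
feats = torch.randn(8, 64)
labels = torch.randint(0, 5, (8,))
prototypes = torch.randn(5, 64)
print(prototype_contrastive_loss(feats, labels, prototypes).item())
```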
- Instance-Level Relative Saliency Ranking with Graph Reasoning [126.09138829920627]
We present a novel unified model to segment salient instances and infer relative saliency rank order.
A novel loss function is also proposed to effectively train the saliency ranking branch.
Experimental results demonstrate that our proposed model is more effective than previous methods.
arXiv Detail & Related papers (2021-07-08T13:10:42Z)
- Train a One-Million-Way Instance Classifier for Unsupervised Visual Representation Learning [45.510042484456854]
This paper presents a simple unsupervised visual representation learning method with a pretext task of discriminating all images in a dataset using a parametric, instance-level computation.
The overall framework is a replica of a supervised classification model, where semantic classes (e.g., dog, bird, and ship) are replaced by instance IDs.
Scaling up the classification task from thousands of semantic labels to millions of instance labels brings specific challenges, including 1) the large-scale softmax classifier; 2) slow convergence due to the infrequent visiting of instance samples; and 3) the massive number of negative classes, which can be noisy.
arXiv Detail & Related papers (2021-02-09T14:44:18Z)
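The parametric instance-classification pretext task above can be sketched as an ordinary softmax classifier whose output count equals the number of training images. The toy below uses a tiny stand-in dataset; the paper's stated challenges (million-way softmax, slow convergence, noisy negatives) only appear at real scale, and all names here are illustrative.

```python
# Toy sketch of instance discrimination with a parametric classifier:
# every training image gets its own class id, so the softmax layer has
# as many outputs as there are images in the dataset.
import torch
import torch.nn as nn

num_images, feat_dim = 1000, 128          # stand-in for millions of instances
encoder = nn.Sequential(nn.Linear(3 * 32 * 32, feat_dim), nn.ReLU())
instance_head = nn.Linear(feat_dim, num_images)     # one logit per image id

images = torch.randn(16, 3 * 32 * 32)               # a random mini-batch
instance_ids = torch.randint(0, num_images, (16,))  # each image's own id

logits = instance_head(encoder(images))
loss = nn.functional.cross_entropy(logits, instance_ids)
loss.backward()
print(loss.item())
```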
- Synthesizing the Unseen for Zero-shot Object Detection [72.38031440014463]
We propose to synthesize visual features for unseen classes, so that the model learns both seen and unseen objects in the visual domain.
We use a novel generative model that uses class-semantics to not only generate the features but also to discriminatively separate them.
arXiv Detail & Related papers (2020-10-19T12:36:11Z)
- CSI: Novelty Detection via Contrastive Learning on Distributionally Shifted Instances [77.28192419848901]
We propose a simple yet effective method named contrasting shifted instances (CSI).
In addition to contrasting a given sample with other instances as in conventional contrastive learning methods, our training scheme contrasts the sample with distributionally-shifted augmentations of itself.
Our experiments demonstrate the superiority of our method under various novelty detection scenarios.
arXiv Detail & Related papers (2020-07-16T08:32:56Z)
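A simplified sketch of the CSI training signal above (the loss shape and the choice of shifting transform are assumptions, not the authors' implementation): besides treating a mild augmentation of an image as the positive, an embedding of a distributionally-shifted view of the same image, e.g. a 90-degree rotation, is treated as a negative.

```python
# Simplified sketch of the CSI idea: two mild views of one image form the
# positive pair, while a hard-shifted view of the same image (e.g. a
# 90-degree rotation, embedded upstream) is pushed away as a negative.
import torch
import torch.nn.functional as F

def csi_style_loss(anchor, positive, shifted, temperature=0.5):
    """anchor/positive: embeddings of two mild views of one image, (D,);
    shifted: embedding of a distributionally-shifted view, (D,)."""
    a, p, s = (F.normalize(x, dim=-1) for x in (anchor, positive, shifted))
    # Positive similarity first; the shifted view acts as the negative.
    logits = torch.stack([a @ p, a @ s]) / temperature
    return F.cross_entropy(logits.unsqueeze(0), torch.zeros(1, dtype=torch.long))

# Toy usage with random embeddings.
emb = lambda: torch.randn(128)
print(csi_style_loss(emb(), emb(), emb()).item())
```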