Real-Time Visual Object Tracking via Few-Shot Learning
- URL: http://arxiv.org/abs/2103.10130v1
- Date: Thu, 18 Mar 2021 10:02:03 GMT
- Title: Real-Time Visual Object Tracking via Few-Shot Learning
- Authors: Jinghao Zhou, Bo Li, Peng Wang, Peixia Li, Weihao Gan, Wei Wu, Junjie Yan, Wanli Ouyang
- Abstract summary: Visual Object Tracking (VOT) can be seen as an extended task of Few-Shot Learning (FSL).
We propose a two-stage framework that can employ a large variety of FSL algorithms while adapting faster.
Experiments on the major benchmarks VOT2018, OTB2015, NFS, UAV123, TrackingNet, and GOT-10k demonstrate a desirable performance gain at real-time speed.
- Score: 107.39695680340877
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Visual Object Tracking (VOT) can be seen as an extended task of Few-Shot
Learning (FSL). While the concept of FSL is not new in tracking and has been
applied by prior works, most of those methods are tailored to specific types of
FSL algorithms and may sacrifice running speed. In this work, we propose a
generalized two-stage framework that can employ a large variety of FSL
algorithms while adapting faster. The first stage uses a Siamese Region
Proposal Network to efficiently propose potential candidates, and the second
stage reformulates the classification of these candidates as a few-shot
classification problem. Following this coarse-to-fine pipeline, the first stage
supplies informative, sparse samples to the second stage, where a large variety
of FSL algorithms can be run more conveniently and efficiently. To substantiate
the second stage, we systematically investigate several optimization-based
few-shot learners from previous works with different objective functions,
optimization methods, or solution spaces. Beyond that, our framework also
allows the direct application of most other FSL algorithms to visual tracking,
fostering exchange between researchers in the two fields. Extensive experiments
on the major benchmarks VOT2018, OTB2015, NFS, UAV123, TrackingNet, and GOT-10k
demonstrate a desirable performance gain at real-time speed.
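To make the coarse-to-fine scheme concrete, here is a minimal PyTorch sketch of the two-stage idea. Everything in it is an illustrative assumption rather than the authors' implementation: a fixed random projection stands in for the shared Siamese backbone, stage 1 is reduced to cosine-similarity matching instead of a full Siamese Region Proposal Network, and stage 2 instantiates one concrete optimization-based few-shot learner, ridge regression, which has the closed-form solution w = (X^T X + lam*I)^{-1} X^T y.

```python
import torch

def extract_features(patches, dim=64):
    # Stand-in for a shared Siamese backbone (a real tracker would use a
    # CNN); a fixed random projection keeps the sketch self-contained.
    torch.manual_seed(0)
    proj = torch.randn(patches[0].numel(), dim)
    return torch.stack([p.reshape(-1) @ proj for p in patches])

def stage1_propose(all_feats, template_feat, k=8):
    # Stage 1 (coarse): score every candidate patch against the template
    # and keep only the top-k as sparse, informative samples for stage 2.
    scores = torch.nn.functional.cosine_similarity(
        all_feats, template_feat.unsqueeze(0), dim=1)
    return all_feats[scores.topk(k).indices]

def fit_ridge_learner(support_feats, labels, lam=1.0):
    # Stage 2 (fine): an optimization-based few-shot learner with a
    # closed-form solution, w = (X^T X + lam * I)^{-1} X^T y.
    X, y = support_feats, labels
    A = X.T @ X + lam * torch.eye(X.shape[1])
    return torch.linalg.solve(A, X.T @ y)

# Toy episode: 50 hypothetical image crops from the current frame.
patches = [torch.rand(3, 32, 32) for _ in range(50)]
feats = extract_features(patches)

# Few-shot support set: the first-frame target (positive) plus negatives.
support, labels = feats[:10], torch.tensor([1.0] + [-1.0] * 9)
w = fit_ridge_learner(support, labels)

candidates = stage1_propose(feats, feats[0], k=8)  # coarse proposals
fine_scores = candidates @ w                       # few-shot classification
print("best candidate score:", fine_scores.max().item())
```

Because stage 2 only ever sees the handful of candidate features kept by stage 1, the ridge learner above could be swapped for a prototype-based or gradient-based few-shot classifier without touching stage 1; that plug-and-play property is the flexibility the abstract claims.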
Related papers
- Less is More: Parameter-Efficient Selection of Intermediate Tasks for Transfer Learning [5.119396962985841]
Intermediate task transfer learning can greatly improve model performance.
We conduct the largest study on NLP task transferability and task selection with 12k source-target pairs.
Applying ESMs (Embedding Space Maps) to a prior method reduces execution time and disk space usage by factors of 10 and 278, respectively.
arXiv Detail & Related papers (2024-10-19T16:22:04Z)
- Cross-Domain Pre-training with Language Models for Transferable Time Series Representations [32.8353465232791]
CrossTimeNet is a novel cross-domain self-supervised learning (SSL) framework that learns transferable knowledge from various domains.
One of the key characteristics of CrossTimeNet is the newly designed time series tokenization module.
We conduct extensive experiments in a real-world scenario across various time series classification domains.
arXiv Detail & Related papers (2024-03-19T02:32:47Z)
- Rethinking Few-Shot Object Detection on a Multi-Domain Benchmark [28.818423712485504]
The Multi-dOmain Few-Shot Object Detection (MoFSOD) benchmark consists of 10 datasets from a wide range of domains.
We analyze the impacts of freezing layers, different architectures, and different pre-training datasets on FSOD performance.
arXiv Detail & Related papers (2022-07-22T16:13:22Z)
- Sylph: A Hypernetwork Framework for Incremental Few-shot Object Detection [8.492340530784697]
We show that finetune-free incremental few-shot detection (iFSD) can be highly effective when a large number of base categories with abundant data are available for meta-training.
We benchmark our model on both COCO and LVIS, reporting up to 17% AP on the long-tail rare classes of LVIS.
arXiv Detail & Related papers (2022-03-25T20:39:00Z)
- DATA: Domain-Aware and Task-Aware Pre-training [94.62676913928831]
We present DATA, a simple yet effective neural architecture search (NAS) approach specialized for self-supervised learning (SSL).
Our method achieves promising results across a wide range of computation costs on downstream tasks, including image classification, object detection and semantic segmentation.
arXiv Detail & Related papers (2022-03-17T02:38:49Z)
- A Strong Baseline for Semi-Supervised Incremental Few-Shot Learning [54.617688468341704]
Few-shot learning aims to learn models that generalize to novel classes with limited training samples.
We propose a novel paradigm containing two parts: (1) a well-designed meta-training algorithm for mitigating ambiguity between base and novel classes caused by unreliable pseudo labels and (2) a model adaptation mechanism to learn discriminative features for novel classes while preserving base knowledge using few labeled and all the unlabeled data.
arXiv Detail & Related papers (2021-10-21T13:25:52Z)
- Efficient Nearest Neighbor Language Models [114.40866461741795]
Non-parametric neural language models (NLMs) learn predictive distributions of text utilizing an external datastore.
We show how to achieve up to a 6x inference speed-up while retaining comparable performance (a minimal sketch of the underlying kNN-LM interpolation follows this list).
arXiv Detail & Related papers (2021-09-09T12:32:28Z)
- On Second-order Optimization Methods for Federated Learning [59.787198516188425]
We evaluate the performance of several second-order distributed methods with local steps in the federated learning setting.
We propose a novel variant that uses second-order local information for updates and a global line search to counteract the resulting local specificity.
arXiv Detail & Related papers (2021-09-06T12:04:08Z)
- TAFSSL: Task-Adaptive Feature Sub-Space Learning for few-shot classification [50.358839666165764]
We show that the Task-Adaptive Feature Sub-Space Learning (TAFSSL) can significantly boost the performance in Few-Shot Learning scenarios.
Specifically, we show that on the challenging miniImageNet and tieredImageNet benchmarks, TAFSSL improves the current state of the art in both transductive and semi-supervised FSL settings by more than 5% (a simplified sketch of the sub-space idea also follows this list).
arXiv Detail & Related papers (2020-03-14T16:59:17Z)
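Two of the entries above describe mechanisms compact enough to sketch. First, "Efficient Nearest Neighbor Language Models" builds on kNN-LMs, which interpolate the base model's next-token distribution with one induced by nearest neighbors in an external datastore. The NumPy sketch below shows only that interpolation; the datastore contents, k, and the mixing weight lam are toy assumptions, and the paper's 6x speed-up comes from datastore-side optimizations not modeled here.

```python
import numpy as np

def knn_lm_next_token(p_lm, context_vec, keys, values, vocab_size,
                      k=4, lam=0.25, temp=1.0):
    # kNN-LM interpolation: p(w) = lam * p_kNN(w) + (1 - lam) * p_LM(w)
    d = np.linalg.norm(keys - context_vec, axis=1)   # distance to datastore
    nn = np.argsort(d)[:k]                           # k nearest entries
    w = np.exp(-d[nn] / temp)                        # closer -> larger weight
    p_knn = np.zeros(vocab_size)
    for idx, weight in zip(nn, w):
        p_knn[values[idx]] += weight                 # neighbors vote for the
    p_knn /= p_knn.sum()                             # tokens they stored
    return lam * p_knn + (1 - lam) * p_lm

# Toy datastore of (context embedding -> observed next token) pairs.
rng = np.random.default_rng(0)
keys = rng.normal(size=(100, 16))
values = rng.integers(0, 50, size=100)
p_lm = np.full(50, 1 / 50)                           # uniform base LM, for demo
p = knn_lm_next_token(p_lm, rng.normal(size=16), keys, values, vocab_size=50)
print(p.argmax(), round(p.sum(), 6))                 # token id; sums to 1.0
```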
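Second, for the TAFSSL entry, the core idea is to compute a feature sub-space adapted to the current few-shot task and classify in it. The following is a minimal PCA-based rendition under the transductive assumption (support and query features are pooled to fit the sub-space); the actual method also explores ICA and clustering-based refinements not shown here.

```python
import numpy as np

def tafssl_style_predict(support, support_y, queries, dim=5):
    # Task-adaptive sub-space: PCA computed on *this task's* features
    # (support and queries together, i.e. the transductive setting).
    X = np.vstack([support, queries])
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    P = vt[:dim].T                                   # top principal directions
    s, q = (support - mean) @ P, (queries - mean) @ P
    classes = np.unique(support_y)
    centroids = np.stack([s[support_y == c].mean(axis=0) for c in classes])
    # Nearest class centroid inside the adapted sub-space.
    d = np.linalg.norm(q[:, None] - centroids[None], axis=2)
    return classes[d.argmin(axis=1)]

# Toy 2-way 5-shot episode with 64-d features around two class means.
rng = np.random.default_rng(1)
c0, c1 = rng.normal(size=64), rng.normal(size=64)
support = np.vstack([c0 + 0.1 * rng.normal(size=(5, 64)),
                     c1 + 0.1 * rng.normal(size=(5, 64))])
support_y = np.repeat([0, 1], 5)
queries = np.vstack([c0 + 0.1 * rng.normal(size=(2, 64)),
                     c1 + 0.1 * rng.normal(size=(2, 64))])
print(tafssl_style_predict(support, support_y, queries))  # expect [0 0 1 1]
```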
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.