Fast Template Matching and Update for Video Object Tracking and
Segmentation
- URL: http://arxiv.org/abs/2004.07538v1
- Date: Thu, 16 Apr 2020 08:58:45 GMT
- Title: Fast Template Matching and Update for Video Object Tracking and
Segmentation
- Authors: Mingjie Sun, Jimin Xiao, Eng Gee Lim, Bingfeng Zhang, Yao Zhao
- Abstract summary: The main task we aim to tackle is the multi-instance semi-supervised video object segmentation across a sequence of frames.
The challenges lie in selecting the matching method used to predict the result and in deciding whether to update the target template.
We propose a novel approach which utilizes reinforcement learning to make these two decisions at the same time.
- Score: 56.465510428878
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, the main task we aim to tackle is the multi-instance
semi-supervised video object segmentation across a sequence of frames where
only the first-frame box-level ground-truth is provided. Detection-based
algorithms are widely adopted to handle this task, and the challenges lie in
the selection of the matching method to predict the result as well as to decide
whether to update the target template using the newly predicted result. The
existing methods, however, make these selections in a rough and inflexible way,
compromising their performance. To overcome this limitation, we propose a novel
approach which utilizes reinforcement learning to make these two decisions at
the same time. Specifically, the reinforcement learning agent learns to decide
whether to update the target template according to the quality of the predicted
result. The choice of the matching method will be determined at the same time,
based on the action history of the reinforcement learning agent. Experiments
show that our method is almost 10 times faster than the previous
state-of-the-art method with even higher accuracy (region similarity of 69.1%
on DAVIS 2017 dataset).
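The control flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the learned reinforcement learning agent is replaced here by a simple quality threshold, and the names `run_tracking`, `update_threshold`, and `patience`, the method labels "fast" and "accurate", and the rule of switching methods after several consecutive refusals to update are all assumptions made for illustration.

```python
from collections import deque
from typing import Deque, List, Tuple


def run_tracking(qualities: List[float],
                 update_threshold: float = 0.5,
                 patience: int = 3) -> List[Tuple[str, bool]]:
    """Return (matching_method, template_updated) for every frame.

    `qualities` holds hypothetical per-frame quality scores in [0, 1]
    for the predicted result; a real system would compute them online.
    """
    # Sliding window over the most recent update decisions (the action history).
    history: Deque[bool] = deque(maxlen=patience)
    decisions: List[Tuple[str, bool]] = []
    for quality in qualities:
        # Assumed rule: after `patience` consecutive refusals to update,
        # switch from the cheap matcher to the slower, more careful one.
        rejected_streak = len(history) == patience and not any(history)
        method = "accurate" if rejected_streak else "fast"
        # Stand-in for the learned agent: update the template only when
        # the predicted result looks good enough.
        update = quality >= update_threshold
        history.append(update)
        decisions.append((method, update))
    return decisions


if __name__ == "__main__":
    for frame, (method, update) in enumerate(run_tracking([0.9, 0.2, 0.1, 0.3, 0.8, 0.9])):
        print(frame, method, "update" if update else "keep")
```

In this sketch a single decision per frame (update or keep the template) also drives the choice of matching method through the recorded history, which mirrors the idea of making both decisions at the same time.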
Related papers
- Adaptive Retention & Correction for Continual Learning [114.5656325514408]
A common problem in continual learning is the classification layer's bias towards the most recent task.
We name our approach Adaptive Retention & Correction (ARC).
ARC achieves an average performance increase of 2.7% and 2.6% on the CIFAR-100 and ImageNet-R datasets, respectively.
arXiv Detail & Related papers (2024-05-23T08:43:09Z)
- Fast Classification with Sequential Feature Selection in Test Phase [1.1470070927586016]
This paper introduces a novel approach to active feature acquisition for classification.
It is the task of sequentially selecting the most informative subset of features to achieve optimal prediction performance.
The proposed approach involves a new lazy model that is significantly faster and more efficient than existing methods.
arXiv Detail & Related papers (2023-06-25T21:31:46Z)
- Regularizing Second-Order Influences for Continual Learning [39.16131410356833]
Continual learning aims to learn on non-stationary data streams without catastrophically forgetting previous knowledge.
Prevalent replay-based methods address this challenge by rehearsing on a small buffer holding the seen data.
We dissect the interaction of sequential selection steps within a framework built on influence functions.
arXiv Detail & Related papers (2023-04-20T09:30:35Z)
- Deep Active Ensemble Sampling For Image Classification [8.31483061185317]
Active learning frameworks aim to reduce the cost of data annotation by actively requesting the labeling for the most informative data points.
Proposed approaches include uncertainty-based techniques, geometric methods, and implicit combinations of the two.
We present an innovative integration of recent progress in both uncertainty-based and geometric frameworks to enable an efficient exploration/exploitation trade-off in sample selection strategy.
Our framework provides two advantages: (1) accurate posterior estimation, and (2) a tunable trade-off between computational overhead and higher accuracy.
arXiv Detail & Related papers (2022-10-11T20:20:20Z)
- Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based
Action Recognition [88.34182299496074]
Action labels are only available on a source dataset, but unavailable on a target dataset in the training stage.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z)
- Towards General and Efficient Active Learning [20.888364610175987]
Active learning aims to select the most informative samples to exploit limited annotation budgets.
We propose a novel general and efficient active learning (GEAL) method in this paper.
Our method can conduct data selection processes on different datasets with a single-pass inference of the same model.
arXiv Detail & Related papers (2021-12-15T08:35:28Z)
- Label, Verify, Correct: A Simple Few Shot Object Detection Method [93.84801062680786]
We introduce a simple pseudo-labelling method to source high-quality pseudo-annotations from a training set.
We present two novel methods to improve the precision of the pseudo-labelling process.
Our method achieves state-of-the-art or second-best performance compared to existing approaches.
arXiv Detail & Related papers (2021-12-10T18:59:06Z)
- Automated Decision-based Adversarial Attacks [48.01183253407982]
We consider the practical and challenging decision-based black-box adversarial setting.
Under this setting, the attacker can only acquire the final classification labels by querying the target model.
We propose to automatically discover decision-based adversarial attack algorithms.
arXiv Detail & Related papers (2021-05-09T13:15:10Z)
- Learning Salient Boundary Feature for Anchor-free Temporal Action
Localization [81.55295042558409]
Temporal action localization is an important yet challenging task in video understanding.
We propose the first purely anchor-free temporal localization method.
Our model includes (i) an end-to-end trainable basic predictor, (ii) a saliency-based refinement module, and (iii) several consistency constraints.
arXiv Detail & Related papers (2021-03-24T12:28:32Z)
- Semi-supervised Facial Action Unit Intensity Estimation with Contrastive
Learning [54.90704746573636]
Our method does not require manually selecting key frames, and produces state-of-the-art results with as little as 2% of annotated frames.
We experimentally validate that our method outperforms existing methods when working with as little as 2% of randomly chosen data.
arXiv Detail & Related papers (2020-11-03T17:35:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.