Alpha-Refine: Boosting Tracking Performance by Precise Bounding Box
Estimation
- URL: http://arxiv.org/abs/2012.06815v2
- Date: Mon, 29 Mar 2021 03:53:00 GMT
- Title: Alpha-Refine: Boosting Tracking Performance by Precise Bounding Box
Estimation
- Authors: Bin Yan, Xinyu Zhang, Dong Wang, Huchuan Lu, Xiaoyun Yang
- Abstract summary: This work proposes a novel, flexible, and accurate refinement module called Alpha-Refine.
It can significantly improve the base trackers' box estimation quality.
Experiments on TrackingNet, LaSOT, GOT-10K, and VOT2020 benchmarks show that our approach significantly improves the base trackers' performance with little extra latency.
- Score: 85.22775182688798
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual object tracking aims to precisely estimate the bounding box for the
given target, which is a challenging problem due to factors such as deformation
and occlusion. Many recent trackers adopt the multiple-stage tracking strategy
to improve the quality of bounding box estimation. These methods first coarsely
locate the target and then refine the initial prediction in the following
stages. However, existing approaches still suffer from limited precision, and
the coupling of different stages severely restricts the method's
transferability. This work proposes a novel, flexible, and accurate refinement
module called Alpha-Refine (AR), which can significantly improve the base
trackers' box estimation quality. By exploring a series of design options, we
conclude that the key to successful refinement is extracting and maintaining
detailed spatial information as much as possible. Following this principle,
Alpha-Refine adopts a pixel-wise correlation, a corner prediction head, and an
auxiliary mask head as the core components. Comprehensive experiments on
TrackingNet, LaSOT, GOT-10K, and VOT2020 benchmarks with multiple base trackers
show that our approach significantly improves the base trackers' performance
with little extra latency. The proposed Alpha-Refine method leads to a series
of strengthened trackers, among which ARSiamRPN (AR-strengthened SiamRPN++)
and ARDiMP50 (AR-strengthened DiMP50) achieve a good efficiency-precision
trade-off, while ARDiMPsuper (AR-strengthened DiMP-super) achieves very
competitive performance at a real-time speed. Code and pretrained models are
available at https://github.com/MasterBin-IIAU/AlphaRefine.
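The abstract names pixel-wise correlation and a corner prediction head as core components. The NumPy sketch below illustrates one common reading of those two ideas and is not taken from the AlphaRefine repository: the function names, tensor shapes, and the use of a soft-argmax to decode a corner heatmap are all assumptions made for illustration.

```python
import numpy as np


def pixelwise_correlation(template, search):
    """Correlate every template location with every search location.

    template: (C, Hz, Wz) feature map of the target (assumed layout).
    search:   (C, Hx, Wx) feature map of the search region (assumed layout).
    Returns:  (Hz * Wz, Hx, Wx), one similarity map per template pixel,
              so the spatial layout of the search region is preserved.
    """
    c, hz, wz = template.shape
    # Each column of `kernels` acts as a 1x1 correlation kernel.
    kernels = template.reshape(c, hz * wz)
    return np.einsum("ck,chw->khw", kernels, search)


def soft_argmax_corner(heatmap):
    """Decode a corner as the expected (x, y) under a softmax of the scores.

    This decoding rule is an assumption of the sketch, chosen because it
    yields sub-pixel coordinates from a dense score map.
    """
    h, w = heatmap.shape
    logits = heatmap.reshape(-1)
    probs = np.exp(logits - logits.max())  # subtract max for stability
    probs /= probs.sum()
    ys, xs = np.divmod(np.arange(h * w), w)  # row and column of each cell
    return float((probs * xs).sum()), float((probs * ys).sum())


# Toy inputs: random features stand in for backbone outputs.
rng = np.random.default_rng(0)
z = rng.standard_normal((8, 4, 4))
x = rng.standard_normal((8, 16, 16))
corr = pixelwise_correlation(z, x)
print(corr.shape)  # (16, 16, 16)

# A heatmap with one sharp peak at row 3, column 5.
peak = np.zeros((8, 10))
peak[3, 5] = 50.0
cx, cy = soft_argmax_corner(peak)
```

Because every template pixel produces its own full-resolution response map, no spatial detail is pooled away before the prediction heads, which matches the abstract's stated principle of maintaining detailed spatial information.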
Related papers
- PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection [14.396629790635474]
Single point supervised oriented object detection has gained attention and made initial progress within the community.
We propose PointOBB-v2, a simpler, faster, and stronger method to generate pseudo rotated boxes from points without relying on any other prior.
arXiv Detail & Related papers (2024-10-10T17:59:56Z)
- Dense Optical Tracking: Connecting the Dots [82.79642869586587]
DOT is a novel, simple and efficient method for solving the problem of point tracking in a video.
We show that DOT is significantly more accurate than current optical flow techniques, outperforms sophisticated "universal trackers" like OmniMotion, and is on par with, or better than, the best point tracking algorithms like CoTracker.
arXiv Detail & Related papers (2023-12-01T18:59:59Z)
- Efficient Few-Shot Object Detection via Knowledge Inheritance [62.36414544915032]
Few-shot object detection (FSOD) aims at learning a generic detector that can adapt to unseen tasks with scarce training samples.
We present an efficient pretrain-transfer framework (PTF) baseline with no computational increment.
We also propose an adaptive length re-scaling (ALR) strategy to alleviate the vector length inconsistency between the predicted novel weights and the pretrained base weights.
arXiv Detail & Related papers (2022-03-23T06:24:31Z)
- Lite-FPN for Keypoint-based Monocular 3D Object Detection [18.03406686769539]
Keypoint-based monocular 3D object detection has made tremendous progress and achieved a great speed-accuracy trade-off.
We propose a lightweight feature pyramid network called Lite-FPN to achieve multi-scale feature fusion.
Our proposed method achieves significantly higher accuracy and frame rate at the same time.
arXiv Detail & Related papers (2021-05-01T14:44:31Z)
- Robust Long-Term Object Tracking via Improved Discriminative Model Prediction [77.72450371348016]
We propose an improved discriminative model prediction method for robust long-term tracking based on a pre-trained short-term tracker.
The proposed method achieves comparable performance to the state-of-the-art long-term trackers.
arXiv Detail & Related papers (2020-08-11T14:31:11Z)
- Alpha-Refine: Boosting Tracking Performance by Precise Bounding Box Estimation [87.53808756910452]
We propose a novel, flexible and accurate refinement module called Alpha-Refine.
It exploits a precise pixel-wise correlation layer together with a spatial-aware non-local layer to fuse features and can predict three complementary outputs: bounding box, corners and mask.
We apply the proposed Alpha-Refine module to five famous and state-of-the-art base trackers: DiMP, ATOM, SiamRPN++, RTMDNet and ECO.
arXiv Detail & Related papers (2020-07-04T07:02:25Z)
- Learning to Optimize Non-Rigid Tracking [54.94145312763044]
We employ learnable optimizations to improve robustness and speed up solver convergence.
First, we upgrade the tracking objective by integrating an alignment data term on deep features, which are learned end-to-end through a CNN.
Second, we bridge the gap between the preconditioning technique and learning method by introducing a ConditionNet which is trained to generate a preconditioner.
arXiv Detail & Related papers (2020-03-27T04:40:57Z)