SRRT: Exploring Search Region Regulation for Visual Object Tracking
- URL: http://arxiv.org/abs/2207.04438v4
- Date: Sun, 9 Jun 2024 04:45:49 GMT
- Title: SRRT: Exploring Search Region Regulation for Visual Object Tracking
- Authors: Jiawen Zhu, Xin Chen, Pengyu Zhang, Xinying Wang, Dong Wang, Wenda Zhao, Huchuan Lu,
- Abstract summary: We propose a novel tracking paradigm, called Search Region Regulation Tracking (SRRT)
SRRT applies a proposed search region regulator to estimate an optimal search region dynamically for each frame.
On the large-scale LaSOT benchmark, SRRT improves SiamRPN++ and TransT with absolute gains of 4.6% and 3.1% in terms of AUC.
- Score: 58.68120400180216
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The dominant trackers generate a fixed-size rectangular region based on the previous prediction or initial bounding box as the model input, i.e., search region. While this manner obtains promising tracking efficiency, a fixed-size search region lacks flexibility and is likely to fail in some cases, e.g., fast motion and distractor interference. Trackers tend to lose the target object due to the limited search region or experience interference from distractors due to the excessive search region. Drawing inspiration from the pattern humans track an object, we propose a novel tracking paradigm, called Search Region Regulation Tracking (SRRT) that applies a small eyereach when the target is captured and zooms out the search field when the target is about to be lost. SRRT applies a proposed search region regulator to estimate an optimal search region dynamically for each frame, by which the tracker can flexibly respond to transient changes in the location of object occurrences. To adapt the object's appearance variation during online tracking, we further propose a lockingstate determined updating strategy for reference frame updating. The proposed SRRT is concise without bells and whistles, yet achieves evident improvements and competitive results with other state-of-the-art trackers on eight benchmarks. On the large-scale LaSOT benchmark, SRRT improves SiamRPN++ and TransT with absolute gains of 4.6% and 3.1% in terms of AUC. The code and models will be released.
Related papers
- REG: Refined Generalized Focal Loss for Road Asset Detection on Thai Highways Using Vision-Based Detection and Segmentation Models [0.0]
This paper introduces a novel framework for detecting and segmenting critical road assets on Thai highways.
Integrated into state-of-the-art vision-based detection and segmentation models, the proposed method effectively addresses class imbalance and the challenges of localizing small, underrepresented road elements.
arXiv Detail & Related papers (2024-09-15T22:04:33Z) - RTracker: Recoverable Tracking via PN Tree Structured Memory [71.05904715104411]
We propose a recoverable tracking framework, RTracker, that uses a tree-structured memory to dynamically associate a tracker and a detector to enable self-recovery.
Specifically, we propose a Positive-Negative Tree-structured memory to chronologically store and maintain positive and negative target samples.
Our core idea is to use the support samples of positive and negative target categories to establish a relative distance-based criterion for a reliable assessment of target loss.
arXiv Detail & Related papers (2024-03-28T08:54:40Z) - RTrack: Accelerating Convergence for Visual Object Tracking via
Pseudo-Boxes Exploration [3.29854706649876]
Single object tracking (SOT) heavily relies on the representation of the target object as a bounding box.
This paper proposes RTrack, a novel object representation baseline tracker.
RTrack automatically arranges points to define the spatial extents and highlight local areas.
arXiv Detail & Related papers (2023-09-23T04:41:59Z) - Detect Any Deepfakes: Segment Anything Meets Face Forgery Detection and
Localization [30.317619885984005]
We introduce the well-trained vision segmentation foundation model, i.e., Segment Anything Model (SAM) in face forgery detection and localization.
Based on SAM, we propose the Detect Any Deepfakes (DADF) framework with the Multiscale Adapter.
The proposed framework seamlessly integrates end-to-end forgery localization and detection optimization.
arXiv Detail & Related papers (2023-06-29T16:25:04Z) - Mesh-SORT: Simple and effective of location-wise tracker [0.0]
In most tracking scenarios, objects tend to move and be lost within specific locations.
We propose different strategies for tracking and association that can identify and target these regions.
We present a robust strategy for dealing with lost objects, as well as a location-wise method for tracking by detection.
arXiv Detail & Related papers (2023-02-28T08:47:53Z) - TDIOT: Target-driven Inference for Deep Video Object Tracking [0.2457872341625575]
In this work, we adopt the pre-trained Mask R-CNN deep object detector as the baseline.
We introduce a novel inference architecture placed on top of FPN-ResNet101 backbone of Mask R-CNN to jointly perform detection and tracking.
The proposed single object tracker, TDIOT, applies an appearance similarity-based temporal matching for data association.
arXiv Detail & Related papers (2021-03-19T20:45:06Z) - CRACT: Cascaded Regression-Align-Classification for Robust Visual
Tracking [97.84109669027225]
We introduce an improved proposal refinement module, Cascaded Regression-Align- Classification (CRAC)
CRAC yields new state-of-the-art performances on many benchmarks.
In experiments on seven benchmarks including OTB-2015, UAV123, NfS, VOT-2018, TrackingNet, GOT-10k and LaSOT, our CRACT exhibits very promising results in comparison with state-of-the-art competitors.
arXiv Detail & Related papers (2020-11-25T02:18:33Z) - Dense Label Encoding for Boundary Discontinuity Free Rotation Detection [69.75559390700887]
This paper explores a relatively less-studied methodology based on classification.
We propose new techniques to push its frontier in two aspects.
Experiments and visual analysis on large-scale public datasets for aerial images show the effectiveness of our approach.
arXiv Detail & Related papers (2020-11-19T05:42:02Z) - Collaborative Training between Region Proposal Localization and
Classification for Domain Adaptive Object Detection [121.28769542994664]
Domain adaptation for object detection tries to adapt the detector from labeled datasets to unlabeled ones for better performance.
In this paper, we are the first to reveal that the region proposal network (RPN) and region proposal classifier(RPC) demonstrate significantly different transferability when facing large domain gap.
arXiv Detail & Related papers (2020-09-17T07:39:52Z) - SADet: Learning An Efficient and Accurate Pedestrian Detector [68.66857832440897]
This paper proposes a series of systematic optimization strategies for the detection pipeline of one-stage detector.
It forms a single shot anchor-based detector (SADet) for efficient and accurate pedestrian detection.
Though structurally simple, it presents state-of-the-art result and real-time speed of $20$ FPS for VGA-resolution images.
arXiv Detail & Related papers (2020-07-26T12:32:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.