Tiny Object Tracking: A Large-scale Dataset and A Baseline
- URL: http://arxiv.org/abs/2202.05659v1
- Date: Fri, 11 Feb 2022 15:00:32 GMT
- Title: Tiny Object Tracking: A Large-scale Dataset and A Baseline
- Authors: Yabin Zhu, Chenglong Li, Yao Liu, Xiao Wang, Jin Tang, Bin Luo,
Zhixiang Huang
- Abstract summary: We create a large-scale video dataset, which contains 434 sequences with a total of more than 217K frames.
In data creation, we take 12 challenge attributes into account to cover a broad range of viewpoints and scene complexities.
We propose a novel Multilevel Knowledge Distillation Network (MKDNet), which pursues three-level knowledge distillations in a unified framework.
- Score: 40.93697515531104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tiny objects, which frequently appear in practical applications, have
weak appearance and features and have received increasing interest in many vision
tasks, such as object detection and segmentation. To promote the research and
development of tiny object tracking, we create a large-scale video dataset,
which contains 434 sequences with a total of more than 217K frames. Each frame
is carefully annotated with a high-quality bounding box. In data creation, we
take 12 challenge attributes into account to cover a broad range of viewpoints
and scene complexities, and annotate these attributes for facilitating the
attribute-based performance analysis. To provide a strong baseline in tiny
object tracking, we propose a novel Multilevel Knowledge Distillation Network
(MKDNet), which pursues three-level knowledge distillations in a unified
framework to effectively enhance the feature representation, discrimination and
localization abilities in tracking tiny objects. Extensive experiments on the
proposed dataset demonstrate the superiority and effectiveness of MKDNet over
state-of-the-art methods. The dataset,
the algorithm code, and the evaluation code are available at
https://github.com/mmic-lcl/Datasets-and-benchmark-code.
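For intuition about the three-level design described above (feature representation, discrimination, and localization), the sketch below composes a distillation objective from three corresponding terms between a teacher and a student tracker. All names, tensor layouts, loss choices, and weights here are illustrative assumptions, not the paper's MKDNet formulation, which is available in the repository linked above.
```python
# A minimal, hypothetical sketch of a three-level distillation loss (feature,
# discrimination, localization). Everything here is an illustrative assumption.
import torch
import torch.nn.functional as F


def multilevel_distillation_loss(student, teacher, weights=(1.0, 1.0, 1.0)):
    """student/teacher: dicts with 'feat', 'cls_logits', and 'box' tensors."""
    w_feat, w_cls, w_box = weights

    # Level 1: feature-level distillation -- align intermediate feature maps.
    l_feat = F.mse_loss(student["feat"], teacher["feat"].detach())

    # Level 2: discrimination-level distillation -- match foreground/background
    # classification distributions via KL divergence.
    l_cls = F.kl_div(
        F.log_softmax(student["cls_logits"], dim=-1),
        F.softmax(teacher["cls_logits"].detach(), dim=-1),
        reduction="batchmean",
    )

    # Level 3: localization-level distillation -- regress the student's box
    # predictions toward the teacher's.
    l_box = F.smooth_l1_loss(student["box"], teacher["box"].detach())

    return w_feat * l_feat + w_cls * l_cls + w_box * l_box


# Example call with random tensors standing in for real tracker outputs.
B = 2
student_out = {"feat": torch.randn(B, 256, 16, 16),
               "cls_logits": torch.randn(B, 2),
               "box": torch.randn(B, 4)}
teacher_out = {"feat": torch.randn(B, 256, 16, 16),
               "cls_logits": torch.randn(B, 2),
               "box": torch.randn(B, 4)}
loss = multilevel_distillation_loss(student_out, teacher_out)
```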
Related papers
- Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast
Contrastive Fusion [110.84357383258818]
We propose a novel approach to lift 2D segments to 3D and fuse them by means of a neural field representation.
The core of our approach is a slow-fast clustering objective function, which is scalable and well-suited for scenes with a large number of objects.
Our approach outperforms the state-of-the-art on challenging scenes from the ScanNet, Hypersim, and Replica datasets.
arXiv Detail & Related papers (2023-06-07T17:57:45Z)
- Uncertainty Aware Active Learning for Reconfiguration of Pre-trained Deep Object-Detection Networks for New Target Domains [0.0]
Object detection is one of the most important and fundamental computer vision tasks.
To obtain training data for object detection models efficiently, many datasets collect their unannotated data in video format.
Annotating every frame from a video is costly and inefficient since many frames contain very similar information for the model to learn from.
In this paper, we propose a novel active learning algorithm for object detection models to tackle this problem.
arXiv Detail & Related papers (2023-03-22T17:14:10Z)
- BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training [44.32782190757813]
We construct a new large-scale benchmark termed BigDetection.
Our dataset has 600 object categories and contains over 3.4M training images with 36M bounding boxes.
arXiv Detail & Related papers (2022-03-24T17:57:29Z)
- Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)
- Unsupervised Discovery of the Long-Tail in Instance Segmentation Using Hierarchical Self-Supervision [3.841232411073827]
We propose a method that can perform unsupervised discovery of long-tail categories in instance segmentation.
Our model is able to discover novel and more fine-grained object categories beyond the common ones.
We show that the model achieves competitive quantitative results on LVIS as compared to the supervised and partially supervised methods.
arXiv Detail & Related papers (2021-04-02T22:05:03Z)
- Campus3D: A Photogrammetry Point Cloud Benchmark for Hierarchical Understanding of Outdoor Scene [76.4183572058063]
We present a richly-annotated 3D point cloud dataset for multiple outdoor scene understanding tasks.
The dataset has been point-wisely annotated with both hierarchical and instance-based labels.
We formulate a hierarchical learning problem for 3D point cloud segmentation and propose a measurement evaluating consistency across various hierarchies.
arXiv Detail & Related papers (2020-08-11T19:10:32Z)
- Visual Tracking by TridentAlign and Context Embedding [71.60159881028432]
We propose novel TridentAlign and context embedding modules for Siamese network-based visual tracking methods.
The performance of the proposed tracker is comparable to that of state-of-the-art trackers, while the proposed tracker runs at real-time speed.
arXiv Detail & Related papers (2020-07-14T08:00:26Z)
- TAO: A Large-Scale Benchmark for Tracking Any Object [95.87310116010185]
The Tracking Any Object (TAO) dataset consists of 2,907 high-resolution videos, captured in diverse environments, which are half a minute long on average.
We ask annotators to label objects that move at any point in the video, and give names to them post factum.
Our vocabulary is both significantly larger and qualitatively different from existing tracking datasets.
arXiv Detail & Related papers (2020-05-20T21:07:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.