BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training
- URL: http://arxiv.org/abs/2203.13249v1
- Date: Thu, 24 Mar 2022 17:57:29 GMT
- Title: BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training
- Authors: Likun Cai, Zhi Zhang, Yi Zhu, Li Zhang, Mu Li and Xiangyang Xue
- Abstract summary: We construct a new large-scale benchmark termed BigDetection.
Our dataset has 600 object categories and contains over 3.4M training images with 36M bounding boxes.
- Score: 44.32782190757813
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multiple datasets and open challenges for object detection have been
introduced in recent years. To build more general and powerful object detection
systems, in this paper, we construct a new large-scale benchmark termed
BigDetection. Our goal is simply to leverage the training data from existing
datasets (LVIS, OpenImages and Objects365) with carefully designed principles,
and curate a larger dataset for improved detector pre-training. Specifically,
we generate a new taxonomy which unifies the heterogeneous label spaces from
different sources. Our BigDetection dataset has 600 object categories and
contains over 3.4M training images with 36M bounding boxes. It is much larger
than previous benchmarks in multiple dimensions, offering both opportunities
and challenges. Extensive experiments demonstrate its validity as
a new benchmark for evaluating different object detection methods, and its
effectiveness as a pre-training dataset.
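The abstract's key construction step is a unified taxonomy over heterogeneous label spaces. The sketch below illustrates the general idea of such a merge in Python; the synonym maps and merge rule are illustrative assumptions, not the paper's actual taxonomy.

```python
# Hypothetical sketch of unifying label spaces from several detection
# datasets into one taxonomy, in the spirit of BigDetection's
# construction. The per-source synonym maps below are made up for
# demonstration and do not reflect the paper's real category mapping.

# Per-source synonym maps: source-specific label -> unified category.
SYNONYMS = {
    "lvis": {"sedan": "car", "automobile": "car", "puppy": "dog"},
    "openimages": {"car": "car", "dog": "dog", "canine": "dog"},
    "objects365": {"car": "car", "dog": "dog"},
}

def unify_label(source: str, label: str) -> str:
    """Map a source-specific label into the unified taxonomy,
    falling back to the raw label when no synonym is known."""
    return SYNONYMS.get(source, {}).get(label, label)

def merge_annotations(annos):
    """Relabel (source, label, box) records into the unified space."""
    return [(unify_label(src, lab), box) for src, lab, box in annos]

annos = [
    ("lvis", "sedan", (10, 10, 50, 40)),
    ("openimages", "canine", (5, 5, 30, 30)),
    ("objects365", "car", (0, 0, 20, 20)),
]
merged = merge_annotations(annos)
# After the merge, all three records share one label space.
```

With a mapping like this in place, annotations from all sources can be pooled into a single pre-training set under one category vocabulary.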
Related papers
- Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation [58.37525311718006]
We put forth a novel formulation of the aerial object detection problem, namely open-vocabulary aerial object detection (OVAD).
We propose CastDet, a CLIP-activated student-teacher detection framework that serves as the first OVAD detector specifically designed for the challenging aerial scenario.
Our framework integrates a robust localization teacher along with several box selection strategies to generate high-quality proposals for novel objects.
arXiv Detail & Related papers (2024-11-04T12:59:13Z)
- SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection [79.23689506129733]
We establish a new benchmark dataset and an open-source method for large-scale SAR object detection.
Our dataset, SARDet-100K, is a result of intense surveying, collecting, and standardizing 10 existing SAR detection datasets.
To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created.
arXiv Detail & Related papers (2024-03-11T09:20:40Z)
- Proposal-Contrastive Pretraining for Object Detection from Fewer Data [11.416621957617334]
We present Proposal Selection Contrast (ProSeCo), a novel unsupervised overall pretraining approach.
ProSeCo uses the large number of object proposals generated by the detector for contrastive learning.
We show that our method outperforms the state of the art in unsupervised pretraining for object detection on standard and novel benchmarks.
arXiv Detail & Related papers (2023-10-25T17:59:26Z)
- MDT3D: Multi-Dataset Training for LiDAR 3D Object Detection Generalization [3.8243923744440926]
3D object detection models trained on a source dataset with a specific point distribution have shown difficulties in generalizing to unseen datasets.
We leverage the information available from several annotated source datasets with our Multi-Dataset Training for 3D Object Detection (MDT3D) method.
We show how we manage the mix of datasets during training, and introduce a new cross-dataset augmentation method: cross-dataset object injection.
arXiv Detail & Related papers (2023-08-02T08:20:00Z)
- Generalized Few-Shot 3D Object Detection of LiDAR Point Cloud for Autonomous Driving [91.39625612027386]
We propose a novel task, called generalized few-shot 3D object detection, where we have a large amount of training data for common (base) objects, but only a few samples for rare (novel) classes.
Specifically, we analyze in-depth differences between images and point clouds, and then present a practical principle for the few-shot setting in the 3D LiDAR dataset.
To solve this task, we propose an incremental fine-tuning method to extend existing 3D detection models to recognize both common and rare objects.
arXiv Detail & Related papers (2023-02-08T07:11:36Z)
- Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding [137.3719377780593]
The new design, named Detection Hub, is dataset-aware and category-aligned.
It mitigates the dataset inconsistency and provides coherent guidance for the detector to learn across multiple datasets.
The categories across datasets are semantically aligned into a unified space by replacing one-hot category representations with word embeddings.
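To make the embedding-based alignment concrete, here is a minimal sketch (not the Detection Hub implementation): categories from different datasets are matched in a shared embedding space by cosine similarity rather than by exact string match. The 3-d embedding vectors below are invented for demonstration.

```python
# Illustrative sketch only: align dataset-specific category names to a
# unified vocabulary via word-embedding similarity instead of one-hot
# labels. The toy embeddings are assumptions, not real word vectors.
import math

EMBED = {
    "car": (0.9, 0.1, 0.0),
    "automobile": (0.88, 0.12, 0.02),  # near-synonym of "car"
    "dog": (0.0, 0.9, 0.3),
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv)

def align(label, vocab):
    """Pick the unified-vocabulary entry closest to `label` in
    embedding space, so near-synonyms map to the same category."""
    return max(vocab, key=lambda c: cosine(EMBED[label], EMBED[c]))

# One dataset's "automobile" aligns to the unified category "car".
print(align("automobile", ["car", "dog"]))  # car
```

Because similarity is computed in a continuous space, semantically close names collapse to one category without any hand-written synonym list.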
arXiv Detail & Related papers (2022-06-07T17:59:44Z)
- Tiny Object Tracking: A Large-scale Dataset and A Baseline [40.93697515531104]
We create a large-scale video dataset, which contains 434 sequences with a total of more than 217K frames.
In data creation, we take 12 challenge attributes into account to cover a broad range of viewpoints and scene complexities.
We propose a novel Multilevel Knowledge Distillation Network (MKDNet), which pursues three-level knowledge distillations in a unified framework.
arXiv Detail & Related papers (2022-02-11T15:00:32Z)
- TAO: A Large-Scale Benchmark for Tracking Any Object [95.87310116010185]
The Tracking Any Object dataset consists of 2,907 high-resolution videos, captured in diverse environments, that are half a minute long on average.
We ask annotators to label objects that move at any point in the video and to name them post factum.
Our vocabulary is both significantly larger and qualitatively different from existing tracking datasets.
arXiv Detail & Related papers (2020-05-20T21:07:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.