TJU-DHD: A Diverse High-Resolution Dataset for Object Detection
- URL: http://arxiv.org/abs/2011.09170v1
- Date: Wed, 18 Nov 2020 09:32:24 GMT
- Title: TJU-DHD: A Diverse High-Resolution Dataset for Object Detection
- Authors: Yanwei Pang and Jiale Cao and Yazhao Li and Jin Xie and Hanqing Sun and Jinfeng Gong
- Abstract summary: Large-scale, rich-diversity, and high-resolution datasets play an important role in developing better object detection methods.
We build a diverse high-resolution dataset (called TJU-DHD).
The dataset contains 115,354 high-resolution images and 709,330 labeled objects with a large variance in scale and appearance.
- Score: 48.94731638729273
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vehicles, pedestrians, and riders are the most important and interesting
objects for the perception modules of self-driving vehicles and video
surveillance. However, the state-of-the-art performance in detecting such
important objects (especially small objects) falls far short of the demands of
practical systems. Large-scale, rich-diversity, and high-resolution datasets
play an important role in developing better object detection methods to meet
this demand. Existing public large-scale datasets such as MS COCO, collected
from websites, do not focus on these specific scenarios. Moreover, the popular
datasets collected from specific scenarios (e.g., KITTI and CityPersons) are
limited in the number of images and instances, in resolution, and in diversity.
To address this problem, we build a diverse high-resolution dataset (called
TJU-DHD). The dataset contains 115,354 high-resolution images (52% with a
resolution of 1624$\times$1200 pixels and 48% with a resolution of at least
2560$\times$1440 pixels) and 709,330 labeled objects in total, with a large
variance in scale and appearance. Meanwhile, the dataset has rich diversity in
season, illumination, and weather. In addition, a new diverse pedestrian
dataset is also built. Experiments on object detection and pedestrian detection
are conducted with four different detectors (the one-stage RetinaNet, the
anchor-free FCOS, the two-stage FPN, and Cascade R-CNN). We hope that the newly
built dataset can help promote research on object detection and pedestrian
detection in these two scenarios. The dataset is available at
https://github.com/tjubiit/TJU-DHD.
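As a rough illustration of the kind of baseline experiments mentioned in the abstract, the sketch below runs an off-the-shelf one-stage detector (a COCO-pretrained RetinaNet from torchvision) on a single image. This is only a minimal sketch: the image path is hypothetical, and the COCO weights are a stand-in, since the paper's baselines are trained and evaluated on TJU-DHD itself.

```python
# Minimal sketch: score one image with a COCO-pretrained RetinaNet.
# "dhd_sample.jpg" is a hypothetical path; substitute an image downloaded
# from https://github.com/tjubiit/TJU-DHD.
import torch
from torchvision.io import read_image
from torchvision.models.detection import (
    RetinaNet_ResNet50_FPN_Weights,
    retinanet_resnet50_fpn,
)
from torchvision.transforms.functional import convert_image_dtype

weights = RetinaNet_ResNet50_FPN_Weights.DEFAULT
model = retinanet_resnet50_fpn(weights=weights).eval()

img = convert_image_dtype(read_image("dhd_sample.jpg"), torch.float)
with torch.no_grad():
    pred = model([img])[0]            # dict with 'boxes', 'scores', 'labels'

keep = pred["scores"] > 0.5           # simple confidence threshold
for box, label in zip(pred["boxes"][keep], pred["labels"][keep]):
    print(weights.meta["categories"][int(label)], [round(v, 1) for v in box.tolist()])
```

Reproducing the paper's numbers would instead fine-tune such detectors on the TJU-DHD training split and report standard COCO-style metrics on its validation set.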
Related papers
- XS-VID: An Extremely Small Video Object Detection Dataset [33.62124448175971]
We develop the XS-VID dataset, which comprises aerial data from various periods and scenes, and annotates eight major object categories.
To further evaluate existing methods for detecting extremely small objects, XS-VID extensively collects three types of objects with smaller pixel areas.
We propose YOLOFT, which enhances local feature associations and integrates temporal motion features, significantly improving the accuracy and stability of small video object detection (SVOD).
arXiv Detail & Related papers (2024-07-25T15:42:46Z)
- SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection [79.23689506129733]
We establish a new benchmark dataset and an open-source method for large-scale SAR object detection.
Our dataset, SARDet-100K, is a result of intense surveying, collecting, and standardizing 10 existing SAR detection datasets.
To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created.
arXiv Detail & Related papers (2024-03-11T09:20:40Z)
- Recurrent Multi-scale Transformer for High-Resolution Salient Object Detection [68.65338791283298]
Salient Object Detection (SOD) aims to identify and segment the most conspicuous objects in an image or video.
Traditional SOD methods are largely limited to low-resolution images, making them difficult to adapt to high-resolution SOD.
In this work, we first propose a new HRS10K dataset, which contains 10,500 high-quality annotated images at 2K-8K resolution.
arXiv Detail & Related papers (2023-08-07T17:49:04Z)
- MetaGraspNet: A Large-Scale Benchmark Dataset for Scene-Aware Ambidextrous Bin Picking via Physics-based Metaverse Synthesis [72.85526892440251]
We introduce MetaGraspNet, a large-scale photo-realistic bin picking dataset constructed via physics-based metaverse synthesis.
The proposed dataset contains 217k RGBD images across 82 different article types, with full annotations for object detection, amodal perception, keypoint detection, manipulation order and ambidextrous grasp labels for a parallel-jaw and vacuum gripper.
We also provide a real dataset consisting of over 2.3k fully annotated high-quality RGBD images, divided into 5 levels of difficulty and an unseen object set to evaluate different object and layout properties.
arXiv Detail & Related papers (2022-08-08T08:15:34Z)
- Remote Sensing Image Super-resolution and Object Detection: Benchmark and State of the Art [7.74389937337756]
This paper reviews current datasets and deep-learning-based object detection methods for remote sensing images.
We propose a large-scale, publicly available benchmark Remote Sensing Super-resolution Object Detection dataset.
We also propose a novel Multi-class Cyclic super-resolution Generative adversarial network with Residual feature aggregation (MCGR) and auxiliary YOLOv5 detector to benchmark image super-resolution-based object detection.
arXiv Detail & Related papers (2021-11-05T04:56:34Z)
- You Better Look Twice: a new perspective for designing accurate detectors with reduced computations [56.34005280792013]
BLT-net is a new low-computation two-stage object detection architecture.
It reduces computation by separating objects from background using a very lightweight first stage.
The resulting image proposals are then processed in the second stage by a highly accurate model.
arXiv Detail & Related papers (2021-07-21T12:39:51Z)
- FAIR1M: A Benchmark Dataset for Fine-grained Object Recognition in High-Resolution Remote Sensing Imagery [21.9319970004788]
We propose a novel benchmark dataset with more than 1 million instances and more than 15,000 images for Fine-grAined object recognItion in high-Resolution remote sensing imagery.
All objects in the FAIR1M dataset are annotated with respect to 5 categories and 37 sub-categories by oriented bounding boxes.
arXiv Detail & Related papers (2021-03-09T17:20:15Z)
- EAGLE: Large-scale Vehicle Detection Dataset in Real-World Scenarios using Aerial Imagery [3.8902657229395894]
We introduce a large-scale dataset for multi-class vehicle detection with object orientation information in aerial imagery.
It features high-resolution aerial images covering a wide variety of real-world situations, with varying camera sensors, resolutions, flight altitudes, weather, illumination, haze, shadow, time, city, country, occlusion, and camera angles.
It contains 215,986 instances annotated with oriented bounding boxes defined by four points and orientation, making it by far the largest dataset for this task to date.
It also supports research on haze and shadow removal, as well as super-resolution and in-painting applications.
arXiv Detail & Related papers (2020-07-12T23:00:30Z)
- Counting dense objects in remote sensing images [52.182698295053264]
Estimating the number of objects of interest in a given image is a challenging yet important task.
In this paper, we are interested in counting dense objects from remote sensing images.
To address these issues, we first construct a large-scale object counting dataset based on remote sensing images.
We then benchmark the dataset by designing a novel neural network that generates a density map of an input image (see the sketch after this entry).
arXiv Detail & Related papers (2020-02-14T09:13:54Z)
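As a small aside on the density-map counting idea in the entry above: summing a predicted per-pixel density map gives the estimated object count. The tiny network below is a hypothetical stand-in used only to show that step, not the architecture from the paper.

```python
# Toy sketch of density-map-based counting: the predicted map integrates
# (sums) to the estimated number of objects in the image.
import torch
import torch.nn as nn

class TinyDensityNet(nn.Module):
    """Hypothetical stand-in that regresses an RGB image to a 1-channel density map."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, kernel_size=1), nn.ReLU(),  # keep densities non-negative
        )

    def forward(self, x):
        return self.net(x)

model = TinyDensityNet().eval()
image = torch.rand(1, 3, 256, 256)       # stand-in for a remote sensing tile
with torch.no_grad():
    density = model(image)               # shape (1, 1, 256, 256)
count = density.sum().item()             # count = integral of the density map
print(f"estimated object count: {count:.1f}")
```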