SynDrone -- Multi-modal UAV Dataset for Urban Scenarios
- URL: http://arxiv.org/abs/2308.10491v1
- Date: Mon, 21 Aug 2023 06:22:10 GMT
- Title: SynDrone -- Multi-modal UAV Dataset for Urban Scenarios
- Authors: Giulia Rizzoli, Francesco Barbato, Matteo Caligiuri, Pietro Zanuttigh
- Abstract summary: The scarcity of large-scale real datasets with pixel-level annotations poses a significant challenge to researchers.
We propose a multimodal synthetic dataset containing both images and 3D data taken at multiple flying heights.
The dataset will be made publicly available to support the development of novel computer vision methods targeting UAV applications.
- Score: 11.338399194998933
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The development of computer vision algorithms for Unmanned Aerial Vehicle
(UAV) imagery heavily relies on the availability of annotated high-resolution
aerial data. However, the scarcity of large-scale real datasets with
pixel-level annotations poses a significant challenge to researchers as the
limited number of images in existing datasets hinders the effectiveness of deep
learning models that require a large amount of training data. In this paper, we
propose a multimodal synthetic dataset containing both images and 3D data taken
at multiple flying heights to address these limitations. In addition to
object-level annotations, the provided data also include pixel-level labeling
in 28 classes, enabling exploration of the potential advantages in tasks like
semantic segmentation. In total, our dataset contains 72k labeled samples that
allow for effective training of deep architectures showing promising results in
synthetic-to-real adaptation. The dataset will be made publicly available to
support the development of novel computer vision methods targeting UAV
applications.
Related papers
- UAV (Unmanned Aerial Vehicles): Diverse Applications of UAV Datasets in Segmentation, Classification, Detection, and Tracking [0.0]
Unmanned Aerial Vehicles (UAVs) have revolutionized the process of gathering and analyzing data in diverse research domains.
UAV datasets consist of various types of data, such as satellite imagery, images captured by drones, and videos.
These datasets play a crucial role in disaster damage assessment, aerial surveillance, object recognition, and tracking.
arXiv Detail & Related papers (2024-09-05T04:47:36Z)
- Plain-Det: A Plain Multi-Dataset Object Detector [22.848784430833835]
Plain-Det offers flexibility to accommodate new datasets, robustness in performance across diverse datasets, and training efficiency.
We conduct extensive experiments on 13 downstream datasets and Plain-Det demonstrates strong generalization capability.
arXiv Detail & Related papers (2024-07-14T05:18:06Z)
- Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery [78.43828998065071]
Recent advances in unsupervised learning have demonstrated the ability of large vision models to achieve promising results on downstream tasks.
Such pre-training techniques have also been explored recently in the remote sensing domain due to the availability of a large amount of unlabelled data.
In this paper, we re-visit transformers pre-training and leverage multi-scale information that is effectively utilized with multiple modalities.
arXiv Detail & Related papers (2024-03-08T16:18:04Z)
- Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z)
- Large Scale Real-World Multi-Person Tracking [68.27438015329807]
This paper presents a new large scale multi-person tracking dataset -- PersonPath22.
It is over an order of magnitude larger than currently available high quality multi-object tracking datasets such as MOT17, HiEve, and MOT20.
arXiv Detail & Related papers (2022-11-03T23:03:13Z)
- MetaGraspNet: A Large-Scale Benchmark Dataset for Scene-Aware Ambidextrous Bin Picking via Physics-based Metaverse Synthesis [72.85526892440251]
We introduce MetaGraspNet, a large-scale photo-realistic bin picking dataset constructed via physics-based metaverse synthesis.
The proposed dataset contains 217k RGBD images across 82 different article types, with full annotations for object detection, amodal perception, keypoint detection, manipulation order and ambidextrous grasp labels for a parallel-jaw and vacuum gripper.
We also provide a real dataset consisting of over 2.3k fully annotated high-quality RGBD images, divided into 5 difficulty levels and an unseen object set to evaluate different object and layout properties.
arXiv Detail & Related papers (2022-08-08T08:15:34Z)
- On The State of Data In Computer Vision: Human Annotations Remain Indispensable for Developing Deep Learning Models [0.0]
High-quality labeled datasets play a crucial role in fueling the development of machine learning (ML).
Since the emergence of the ImageNet dataset and the AlexNet model in 2012, the size of new open-source labeled vision datasets has remained roughly constant.
Only a minority of publications in the computer vision community tackle supervised learning on datasets that are orders of magnitude larger than Imagenet.
arXiv Detail & Related papers (2021-07-31T00:08:21Z)
- Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)
- Dense Multiscale Feature Fusion Pyramid Networks for Object Detection in UAV-Captured Images [0.09065034043031667]
We propose a novel method called Dense Multiscale Feature Fusion Pyramid Networks (DMFFPN), which aims to obtain features that are as rich as possible.
Specifically, the dense connection is designed to fully utilize the representation from the different convolutional layers.
Experiments on the drone-based dataset VisDrone-DET suggest a competitive performance of our method.
arXiv Detail & Related papers (2020-12-19T10:05:31Z)
- Campus3D: A Photogrammetry Point Cloud Benchmark for Hierarchical Understanding of Outdoor Scene [76.4183572058063]
We present a richly-annotated 3D point cloud dataset for multiple outdoor scene understanding tasks.
The dataset has been point-wisely annotated with both hierarchical and instance-based labels.
We formulate a hierarchical learning problem for 3D point cloud segmentation and propose a measurement evaluating consistency across various hierarchies.
arXiv Detail & Related papers (2020-08-11T19:10:32Z)
- Deflating Dataset Bias Using Synthetic Data Augmentation [8.509201763744246]
State-of-the-art methods for most vision tasks for Autonomous Vehicles (AVs) rely on supervised learning.
The goal of this paper is to investigate the use of targeted synthetic data augmentation for filling gaps in real datasets for vision tasks.
Empirical studies on three different computer vision tasks of practical use to AVs consistently show that having synthetic data in the training mix provides a significant boost in cross-dataset generalization performance.
arXiv Detail & Related papers (2020-04-28T21:56:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.