A Survey on RGB-D Datasets
- URL: http://arxiv.org/abs/2201.05761v1
- Date: Sat, 15 Jan 2022 05:35:19 GMT
- Title: A Survey on RGB-D Datasets
- Authors: Alexandre Lopes, Roberto Souza, Helio Pedrini
- Abstract summary: This paper reviewed and categorized image datasets that include depth information.
We gathered 203 datasets that contain accessible data and grouped them into three categories: scene/objects, body, and medical.
- Score: 69.73803123972297
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: RGB-D data is essential for solving many problems in computer vision.
Hundreds of public RGB-D datasets containing various scenes, such as indoor,
outdoor, aerial, driving, and medical, have been proposed. These datasets are
useful for different applications and are fundamental for addressing classic
computer vision tasks, such as monocular depth estimation. This paper reviewed
and categorized image datasets that include depth information. We gathered 203
datasets that contain accessible data and grouped them into three categories:
scene/objects, body, and medical. We also provided an overview of the different
types of sensors and depth applications, examined trends and future directions
in the usage and creation of datasets containing depth data, and discussed how
such datasets can support the development of generalizable machine learning
models in the monocular depth estimation field.
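As context for how the depth channel in RGB-D datasets is typically consumed (e.g. when evaluating monocular depth estimation against sensor ground truth), a depth map can be back-projected to camera-frame 3D points with the standard pinhole model. The sketch below is illustrative only; the function name and toy intrinsics are assumptions, not taken from the survey.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (meters) to camera-frame 3D points
    via the pinhole model: X=(u-cx)*Z/fx, Y=(v-cy)*Z/fy, Z=depth."""
    h, w = depth.shape
    # Pixel coordinate grids: u varies along columns, v along rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)  # shape (H, W, 3)

# Toy example: a 2x2 depth map with every pixel 1 m away,
# using made-up intrinsics centered on the tiny image.
depth = np.ones((2, 2))
pts = depth_to_points(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

Real datasets ship per-sensor calibrated intrinsics (and often depth-to-color extrinsics), which replace the toy values above.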
Related papers
- Space3D-Bench: Spatial 3D Question Answering Benchmark [49.259397521459114]
We present Space3D-Bench - a collection of 1000 general spatial questions and answers related to scenes of the Replica dataset.
We provide an assessment system that grades natural language responses based on predefined ground-truth answers.
Finally, we introduce a baseline called RAG3D-Chat integrating the world understanding of foundation models with rich context retrieval.
arXiv Detail & Related papers (2024-08-29T16:05:22Z)
- SynDrone -- Multi-modal UAV Dataset for Urban Scenarios [11.338399194998933]
The scarcity of large-scale real datasets with pixel-level annotations poses a significant challenge to researchers.
We propose a multimodal synthetic dataset containing both images and 3D data taken at multiple flying heights.
The dataset will be made publicly available to support the development of novel computer vision methods targeting UAV applications.
arXiv Detail & Related papers (2023-08-21T06:22:10Z)
- IDD-3D: Indian Driving Dataset for 3D Unstructured Road Scenes [79.18349050238413]
Preparation and training of deployable deep learning architectures require the models to be suited to different traffic scenarios.
An unstructured and complex driving layout found in several developing countries such as India poses a challenge to these models.
We build a new dataset, IDD-3D, which consists of multi-modal data from multiple cameras and LiDAR sensors with 12k annotated driving LiDAR frames.
arXiv Detail & Related papers (2022-10-23T23:03:17Z)
- Deep Depth Completion: A Survey [26.09557446012222]
We provide a comprehensive literature review that helps readers better grasp the research trends and clearly understand the current advances.
We investigate the related studies from the design aspects of network architectures, loss functions, benchmark datasets, and learning strategies.
We present a quantitative comparison of model performance on two widely used benchmark datasets, including an indoor and an outdoor dataset.
arXiv Detail & Related papers (2022-05-11T08:24:00Z)
- Multi-sensor large-scale dataset for multi-view 3D reconstruction [63.59401680137808]
We present a new multi-sensor dataset for multi-view 3D surface reconstruction.
It includes registered RGB and depth data from sensors of different resolutions and modalities: smartphones, Intel RealSense, Microsoft Kinect, industrial cameras, and a structured-light scanner.
We provide around 1.4 million images of 107 different scenes acquired from 100 viewing directions under 14 lighting conditions.
arXiv Detail & Related papers (2022-03-11T17:32:27Z)
- Do Datasets Have Politics? Disciplinary Values in Computer Vision Dataset Development [6.182409582844314]
We collect a corpus of about 500 computer vision datasets, from which we sampled 114 dataset publications across different vision tasks.
We discuss how computer vision dataset authors value efficiency at the expense of care; universality at the expense of contextuality; and model work at the expense of data work.
We conclude with suggestions on how to better incorporate silenced values into the dataset creation and curation process.
arXiv Detail & Related papers (2021-08-09T19:07:58Z)
- REGRAD: A Large-Scale Relational Grasp Dataset for Safe and Object-Specific Robotic Grasping in Clutter [52.117388513480435]
We present a new dataset named REGRAD to support the modeling of relationships among objects and grasps.
Our dataset is collected in both forms of 2D images and 3D point clouds.
Users are free to import their own object models to generate as much data as they want.
arXiv Detail & Related papers (2021-04-29T05:31:21Z)
- Light Field Salient Object Detection: A Review and Benchmark [37.28938750278883]
This paper provides the first comprehensive review and benchmark for light field SOD.
It covers ten traditional models, seven deep learning-based models, one comparative study, and one brief review.
We benchmark nine representative light field SOD models together with several cutting-edge RGB-D SOD models on four widely used light field datasets.
arXiv Detail & Related papers (2020-10-10T10:30:40Z)
- RGB-D Salient Object Detection: A Survey [195.83586883670358]
We provide a comprehensive survey of RGB-D based SOD models from various perspectives.
We also review SOD models and popular benchmark datasets from this domain.
We discuss several challenges and open directions of RGB-D based SOD for future research.
arXiv Detail & Related papers (2020-08-01T10:01:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.