Shape2.5D: A Dataset of Texture-less Surfaces for Depth and Normals Estimation
- URL: http://arxiv.org/abs/2406.15831v2
- Date: Tue, 05 Nov 2024 09:56:56 GMT
- Title: Shape2.5D: A Dataset of Texture-less Surfaces for Depth and Normals Estimation
- Authors: Muhammad Saif Ullah Khan, Sankalp Sinha, Didier Stricker, Marcus Liwicki, Muhammad Zeshan Afzal
- Abstract summary: "Shape2.5D" is a novel, large-scale dataset designed to address the lack of specialized datasets for depth and normals estimation on texture-less surfaces.
The proposed dataset includes synthetic images rendered with 3D modeling software.
It also includes a real-world subset comprising 4,672 frames captured with a depth camera.
- Score: 12.757150641117077
- Abstract: Reconstructing texture-less surfaces poses unique challenges in computer vision, primarily due to the lack of specialized datasets that cater to the nuanced needs of depth and normals estimation in the absence of textural information. We introduce "Shape2.5D," a novel, large-scale dataset designed to address this gap. Comprising 1.17 million frames spanning over 39,772 3D models and 48 unique objects, our dataset provides depth and surface normal maps for texture-less object reconstruction. The proposed dataset includes synthetic images rendered with 3D modeling software to simulate various lighting conditions and viewing angles. It also includes a real-world subset comprising 4,672 frames captured with a depth camera. Our comprehensive benchmarks demonstrate the dataset's ability to support the development of algorithms that robustly estimate depth and normals from RGB images and perform voxel reconstruction. Our open-source data generation pipeline allows the dataset to be extended and adapted for future research. The dataset is publicly available at https://github.com/saifkhichi96/Shape25D.
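To experiment with the dataset, something like the following PyTorch loader can assemble RGB/depth/normal triplets. The folder layout (rgb/, depth/, normals/ subfolders with matching file stems) and the depth and normal encodings here are assumptions for illustration, not the repository's documented format; see the GitHub link above for the actual structure:

```python
import os
from glob import glob

import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset


class Shape25DFrames(Dataset):
    """Loads RGB/depth/normal frame triplets from a Shape2.5D-style folder.

    Assumed (hypothetical) layout:
        root/rgb/NNNN.png, root/depth/NNNN.png, root/normals/NNNN.png
    """

    def __init__(self, root):
        self.root = root
        self.rgb_paths = sorted(glob(os.path.join(root, "rgb", "*.png")))

    def __len__(self):
        return len(self.rgb_paths)

    def __getitem__(self, idx):
        stem = os.path.splitext(os.path.basename(self.rgb_paths[idx]))[0]

        rgb = np.asarray(Image.open(self.rgb_paths[idx]).convert("RGB"),
                         dtype=np.float32) / 255.0
        # Assumed encoding: 16-bit PNG depth in millimeters, converted to meters.
        depth = np.asarray(Image.open(os.path.join(self.root, "depth", stem + ".png")),
                           dtype=np.float32) / 1000.0
        # Assumed encoding: normals mapped from [-1, 1] to [0, 255] per channel.
        normals = np.asarray(Image.open(os.path.join(self.root, "normals", stem + ".png")).convert("RGB"),
                             dtype=np.float32) / 127.5 - 1.0

        return {
            "rgb": torch.from_numpy(rgb).permute(2, 0, 1),          # (3, H, W)
            "depth": torch.from_numpy(depth).unsqueeze(0),          # (1, H, W)
            "normals": torch.from_numpy(normals).permute(2, 0, 1),  # (3, H, W)
        }
```

From here, a standard torch.utils.data.DataLoader can batch the dictionaries for training depth and normals estimators.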
Related papers
- OpenMaterial: A Comprehensive Dataset of Complex Materials for 3D Reconstruction [54.706361479680055]
We introduce the OpenMaterial dataset, comprising 1001 objects made of 295 distinct materials.
OpenMaterial provides comprehensive annotations, including 3D shape, material type, camera pose, depth, and object mask.
It stands as the first large-scale dataset enabling quantitative evaluations of existing algorithms on objects with diverse and challenging materials.
arXiv Detail & Related papers (2024-06-13T07:46:17Z)
- Virtually Enriched NYU Depth V2 Dataset for Monocular Depth Estimation: Do We Need Artificial Augmentation? [61.234412062595155]
We present ANYU, a new virtually augmented version of the NYU depth v2 dataset, designed for monocular depth estimation.
In contrast to the well-known approach where full 3D scenes of a virtual world are utilized to generate artificial datasets, ANYU was created by incorporating RGB-D representations of virtual reality objects.
We show that ANYU improves the monocular depth estimation performance and generalization of deep neural networks with considerably different architectures.
arXiv Detail & Related papers (2024-04-15T05:44:03Z)
- FOUND: Foot Optimization with Uncertain Normals for Surface Deformation Using Synthetic Data [27.53648027412686]
We seek to develop a method for few-view reconstruction of the human foot.
To solve this task, we must extract rich geometric cues from RGB images before carefully fusing them into a final 3D object.
We show that our normal predictor significantly outperforms all off-the-shelf equivalents on real images.
arXiv Detail & Related papers (2023-10-27T17:11:07Z)
- FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z)
- High-Resolution Synthetic RGB-D Datasets for Monocular Depth Estimation [3.349875948009985]
We generate a high-resolution synthetic depth dataset (HRSD) at 1920×1080 resolution from Grand Theft Auto V (GTA-V), containing 100,000 color images and corresponding dense ground-truth depth maps.
For experiments and analysis, we train DPT, a state-of-the-art transformer-based monocular depth estimation (MDE) model, on the proposed synthetic dataset, which increases the accuracy of depth maps on different scenes by 9% (see the metrics sketch after this entry).
arXiv Detail & Related papers (2023-05-02T19:03:08Z)
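The 9% figure above reflects standard monocular depth-estimation metrics. As a point of reference, a minimal sketch of the usual metrics follows; depth_metrics is an illustrative helper, not code from the paper:

```python
import numpy as np


def depth_metrics(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-6) -> dict:
    """Standard monocular depth metrics over valid (gt > 0) pixels."""
    valid = gt > 0
    pred, gt = pred[valid], gt[valid]

    # Absolute relative error: mean |pred - gt| / gt (lower is better).
    abs_rel = float(np.mean(np.abs(pred - gt) / (gt + eps)))
    # Root mean squared error in depth units (e.g., meters).
    rmse = float(np.sqrt(np.mean((pred - gt) ** 2)))
    # Threshold accuracy delta_1: share of pixels with max(pred/gt, gt/pred) < 1.25.
    ratio = np.maximum(pred / (gt + eps), gt / (pred + eps))
    delta1 = float(np.mean(ratio < 1.25))

    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}
```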
- Domain Randomization-Enhanced Depth Simulation and Restoration for Perceiving and Grasping Specular and Transparent Objects [28.84776177634971]
We propose a powerful RGBD fusion network, SwinDRNet, for depth restoration.
We also propose Domain Randomization-Enhanced Depth Simulation (DREDS) approach to simulate an active stereo depth system.
We show that our depth restoration effectively boosts the performance of downstream tasks.
arXiv Detail & Related papers (2022-08-07T19:17:16Z)
- RTMV: A Ray-Traced Multi-View Synthetic Dataset for Novel View Synthesis [104.53930611219654]
We present a large-scale synthetic dataset for novel view synthesis consisting of 300k images rendered from nearly 2000 complex scenes.
The dataset is orders of magnitude larger than existing synthetic datasets for novel view synthesis.
Using 4 distinct sources of high-quality 3D meshes, the scenes of our dataset exhibit challenging variations in camera views, lighting, shape, materials, and textures.
arXiv Detail & Related papers (2022-05-14T13:15:32Z)
- A Real World Dataset for Multi-view 3D Reconstruction [28.298548207213468]
We present a dataset of 371 3D models of everyday tabletop objects along with their 320,000 real-world RGB and depth images.
We primarily focus on learned multi-view 3D reconstruction due to the lack of an appropriate real-world benchmark for the task, and demonstrate that our dataset can fill that gap.
arXiv Detail & Related papers (2022-03-22T00:15:54Z)
- Multi-sensor large-scale dataset for multi-view 3D reconstruction [63.59401680137808]
We present a new multi-sensor dataset for multi-view 3D surface reconstruction.
It includes registered RGB and depth data from sensors of different resolutions and modalities: smartphones, Intel RealSense, Microsoft Kinect, industrial cameras, and a structured-light scanner.
We provide around 1.4 million images of 107 different scenes acquired from 100 viewing directions under 14 lighting conditions.
arXiv Detail & Related papers (2022-03-11T17:32:27Z)
- Ground Material Classification for UAV-based Photogrammetric 3D Data: A 2D-3D Hybrid Approach [1.3359609092684614]
In recent years, photogrammetry has been widely used in many areas to create 3D virtual data representing the physical environment.
These cutting-edge technologies have caught the attention of the US Army and Navy for rapid 3D battlefield reconstruction, virtual training, and simulation.
arXiv Detail & Related papers (2021-09-24T22:29:26Z)
- GeoNet++: Iterative Geometric Neural Network with Edge-Aware Refinement for Joint Depth and Surface Normal Estimation [204.13451624763735]
We propose a geometric neural network with edge-aware refinement (GeoNet++) to jointly predict both depth and surface normal maps from a single image.
GeoNet++ effectively predicts depth and surface normals with strong 3D consistency and sharp boundaries.
In contrast to current metrics that focus on evaluating pixel-wise error/accuracy, the proposed 3D geometric metric (3DGM) measures whether the predicted depth can reconstruct high-quality 3D surface normals (see the depth-to-normals sketch after this entry).
arXiv Detail & Related papers (2020-12-13T06:48:01Z)
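A metric like 3DGM rests on the fact that surface normals can be recovered from a depth map. Below is a minimal sketch of that conversion via finite differences under an assumed pinhole camera with the principal point at the image center; normals_from_depth is an illustrative baseline, not GeoNet++'s depth-to-normal module:

```python
import numpy as np


def normals_from_depth(depth: np.ndarray, fx: float, fy: float) -> np.ndarray:
    """Estimate per-pixel surface normals from a depth map (H, W).

    Back-projects depth to a camera-space point map, takes image-space
    gradients, and crosses the tangent vectors. The normal orientation
    depends on the assumed camera convention.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))

    # Back-project pixels to 3D points, assuming the principal point
    # sits at the image center.
    x = (u - w / 2.0) * depth / fx
    y = (v - h / 2.0) * depth / fy
    points = np.stack([x, y, depth], axis=-1)  # (H, W, 3)

    # Tangent vectors via central finite differences along image axes.
    du = np.gradient(points, axis=1)
    dv = np.gradient(points, axis=0)

    # Normal is the normalized cross product of the tangents.
    n = np.cross(du, dv)
    n /= np.linalg.norm(n, axis=-1, keepdims=True) + 1e-8
    return n  # (H, W, 3), unit vectors
```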