Open Challenges in Deep Stereo: the Booster Dataset
- URL: http://arxiv.org/abs/2206.04671v1
- Date: Thu, 9 Jun 2022 17:59:56 GMT
- Title: Open Challenges in Deep Stereo: the Booster Dataset
- Authors: Pierluigi Zama Ramirez, Fabio Tosi, Matteo Poggi, Samuele Salti,
Stefano Mattoccia, Luigi Di Stefano
- Abstract summary: We present a novel high-resolution and challenging stereo dataset framing indoor scenes annotated with dense and accurate ground-truth disparities.
Peculiar to our dataset is the presence of several specular and transparent surfaces.
We release a total of 419 samples collected in 64 different scenes and annotated with dense ground-truth disparities.
- Score: 49.28588927121722
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel high-resolution and challenging stereo dataset framing
indoor scenes annotated with dense and accurate ground-truth disparities.
Peculiar to our dataset is the presence of several specular and transparent
surfaces, i.e. the main causes of failures for state-of-the-art stereo
networks. Our acquisition pipeline leverages a novel deep space-time stereo
framework which allows for easy and accurate labeling with sub-pixel precision.
We release a total of 419 samples collected in 64 different scenes and
annotated with dense ground-truth disparities. Each sample includes a
high-resolution pair (12 Mpx) as well as an unbalanced pair (Left: 12 Mpx,
Right: 1.1 Mpx). Additionally, we provide manually annotated material
segmentation masks and 15K unlabeled samples. We evaluate state-of-the-art deep
networks based on our dataset, highlighting their limitations in addressing the
open challenges in stereo and drawing hints for future research.
Related papers
- Depth-aware Volume Attention for Texture-less Stereo Matching [67.46404479356896]
We propose a lightweight volume refinement scheme to tackle the texture deterioration in practical outdoor scenarios.
We introduce a depth volume supervised by the ground-truth depth map, capturing the relative hierarchy of image texture.
Local fine structure and context are emphasized to mitigate ambiguity and redundancy during volume aggregation.
arXiv Detail & Related papers (2024-02-14T04:07:44Z)
- High-Resolution Synthetic RGB-D Datasets for Monocular Depth Estimation [3.349875948009985]
We generate a high-resolution synthetic depth dataset (HRSD) of dimension 1920 × 1080 from Grand Theft Auto (GTA-V), which contains 100,000 color images and corresponding dense ground truth depth maps.
For experiments and analysis, we train the DPT algorithm, a state-of-the-art transformer-based MDE algorithm, on the proposed synthetic dataset, which significantly increases the accuracy of depth maps on different scenes by 9%.
arXiv Detail & Related papers (2023-05-02T19:03:08Z)
- Booster: a Benchmark for Depth from Images of Specular and Transparent Surfaces [49.44971010149331]
We propose a novel dataset that includes accurate and dense ground-truth labels at high resolution.
Our acquisition pipeline leverages a novel deep space-time stereo framework.
The dataset is composed of 606 samples collected in 85 different scenes.
arXiv Detail & Related papers (2023-01-19T18:59:28Z)
- iNNformant: Boundary Samples as Telltale Watermarks [68.8204255655161]
We show that it is possible to generate sets of boundary samples which can identify any of four tested microarchitectures.
These sets can be built to not contain any sample with a worse peak signal-to-noise ratio than 70 dB.
arXiv Detail & Related papers (2021-06-14T11:18:32Z)
- SMD-Nets: Stereo Mixture Density Networks [68.56947049719936]
We propose Stereo Mixture Density Networks (SMD-Nets), a simple yet effective learning framework compatible with a wide class of 2D and 3D architectures.
Specifically, we exploit bimodal mixture densities as output representation and show that this allows for sharp and precise disparity estimates near discontinuities.
We carry out comprehensive experiments on a new high-resolution and highly realistic synthetic stereo dataset, consisting of stereo pairs at 8Mpx resolution, as well as on real-world stereo datasets.
arXiv Detail & Related papers (2021-04-08T16:15:46Z)
- LIGHTS: LIGHT Specularity Dataset for specular detection in Multi-view [12.612981566441908]
We propose a novel physically-based rendered LIGHT Specularity (LIGHTS) dataset for the evaluation of the specular highlight detection task.
Our dataset consists of 18 high quality architectural scenes, where each scene is rendered with multiple views.
In total we have 2,603 views with an average of 145 views per scene.
arXiv Detail & Related papers (2021-01-26T13:26:49Z)
- A Flow Base Bi-path Network for Cross-scene Video Crowd Understanding in Aerial View [93.23947591795897]
In this paper, we strive to tackle the challenges and automatically understand the crowd from the visual data collected from drones.
To alleviate the background noise generated in cross-scene testing, a double-stream crowd counting model is proposed.
To tackle the crowd density estimation problem under extreme dark environments, we introduce synthetic data generated by the game Grand Theft Auto V (GTAV).
arXiv Detail & Related papers (2020-09-29T01:48:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.