Does it work outside this benchmark? Introducing the Rigid Depth Constructor tool, depth validation dataset construction in rigid scenes for the masses
- URL: http://arxiv.org/abs/2103.15970v1
- Date: Mon, 29 Mar 2021 22:01:24 GMT
- Title: Does it work outside this benchmark? Introducing the Rigid Depth Constructor tool, depth validation dataset construction in rigid scenes for the masses
- Authors: Clément Pinard, Antoine Manzanera
- Abstract summary: We present a protocol to construct your own depth validation dataset for navigation.
RDC, for Rigid Depth Constructor, aims to be more accessible and cheaper than existing techniques.
We also develop a test suite to extract insightful information from the evaluated algorithm.
- Score: 1.294486861344922
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We present a protocol to construct your own depth validation dataset for
navigation. This protocol, called RDC for Rigid Depth Constructor, aims to be
more accessible and cheaper than existing techniques, requiring only a camera
and a Lidar sensor to get started. We also develop a test suite to extract
insightful information from the evaluated algorithm. Finally, we take the
example of UAV videos, on which we test two depth algorithms that were
initially tested on KITTI, and show that the drone context is dramatically
different from in-car videos. This shows that a single-context benchmark should
not be considered reliable, and that when developing a depth estimation algorithm,
one should benchmark it on a dataset that best fits one's particular needs,
which often means creating a brand new one. Along with this paper, we provide
an open-source implementation of the tool and plan to make it as user-friendly
as possible, so that depth dataset creation is feasible even for small teams.
Our key contributions are the following: we propose a complete, open-source and
almost fully automatic software application for creating validation datasets
with densely annotated depth, adaptable to a wide variety of image, video and
range data. It includes selection tools to adapt the dataset to specific
validation needs, and conversion tools to other dataset formats. Using this
application, we propose two new real datasets, outdoor and indoor, readily
usable in a UAV navigation context. Finally, as examples, we show an evaluation
of two depth prediction algorithms, using a collection of comprehensive
(e.g. distribution-based) metrics.
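The abstract mentions a test suite reporting comprehensive, distribution-based metrics. As a rough illustration of what such an evaluation can compute, here is a minimal sketch of standard pointwise depth metrics plus error quantiles, assuming dense ground-truth and predicted depth maps as NumPy arrays; the function name and thresholds are illustrative, not part of the RDC tool's API.

```python
import numpy as np

def depth_metrics(gt, pred, min_depth=1e-3, max_depth=100.0):
    """Pointwise depth metrics plus a distribution-based view of the error.

    gt, pred: depth maps of the same shape, in metres.
    Pixels with ground truth outside (min_depth, max_depth) are ignored.
    """
    mask = (gt > min_depth) & (gt < max_depth)
    g, p = gt[mask], pred[mask]

    rel_err = np.abs(g - p) / g
    abs_rel = rel_err.mean()                            # mean absolute relative error
    rmse = np.sqrt(np.mean((g - p) ** 2))               # root mean squared error
    delta1 = np.mean(np.maximum(g / p, p / g) < 1.25)   # accuracy within a 1.25x ratio

    # Distribution-based metrics: keep quantiles of the relative error so
    # heavy tails (e.g. failures at long range) stay visible instead of
    # being averaged away.
    q = np.percentile(rel_err, [25, 50, 75, 95])

    return {
        "abs_rel": float(abs_rel),
        "rmse": float(rmse),
        "delta<1.25": float(delta1),
        "rel_err_quantiles": {"p25": q[0], "p50": q[1], "p75": q[2], "p95": q[3]},
    }
```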
Related papers
- Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data [87.61900472933523]
This work presents Depth Anything, a highly practical solution for robust monocular depth estimation.
We scale up the dataset by designing a data engine to collect and automatically annotate large-scale unlabeled data.
We evaluate its zero-shot capabilities extensively, including six public datasets and randomly captured photos.
arXiv Detail & Related papers (2024-01-19T18:59:52Z)
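The Depth Anything entry above describes a data engine that automatically annotates unlabeled images. The generic pattern behind such engines is teacher-based pseudo-labelling; the sketch below shows only that pattern, with assumed helper names (`teacher`, `confidence`), and is not the paper's actual pipeline.

```python
import torch

@torch.no_grad()
def pseudo_label(teacher, unlabeled_loader, confidence, keep_threshold=0.5):
    """Generic self-training step: a frozen teacher annotates unlabeled images.

    teacher:    depth network returning a (B, H, W) depth map per batch
    confidence: assumed callable scoring each prediction in [0, 1],
                e.g. by agreement between augmented views
    Returns a list of (image, pseudo_depth) pairs kept for student training.
    """
    teacher.eval()
    kept = []
    for images in unlabeled_loader:
        depth = teacher(images)
        scores = confidence(images, depth)      # one score per sample
        for img, d, s in zip(images, depth, scores):
            if s.item() > keep_threshold:       # discard low-confidence labels
                kept.append((img, d))
    return kept
```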
- Metrically Scaled Monocular Depth Estimation through Sparse Priors for Underwater Robots [0.0]
We formulate a deep learning model that fuses sparse depth measurements from triangulated features to improve the depth predictions.
The network is trained in a supervised fashion on the forward-looking underwater dataset, FLSea.
The method achieves real-time performance, running at 160 FPS on a laptop GPU and 7 FPS on a single CPU core.
arXiv Detail & Related papers (2023-10-25T16:32:31Z)
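The underwater depth entry above fuses sparse triangulated measurements with dense predictions inside the network. A much simpler way to see why sparse metric points help is a closed-form least-squares scale fit of a relative depth map, sketched here as an illustration rather than the paper's method.

```python
import numpy as np

def fit_metric_scale(pred_rel, sparse_depth, valid):
    """Rescale a scale-ambiguous depth map to match sparse metric points.

    pred_rel:     predicted relative depth, shape (H, W)
    sparse_depth: metric depth at triangulated feature locations, shape (H, W)
    valid:        boolean mask, True where sparse_depth is defined
    The factor s minimises sum_i (s * pred_i - sparse_i)^2 in closed form.
    """
    p = pred_rel[valid].astype(np.float64)
    d = sparse_depth[valid].astype(np.float64)
    s = np.dot(p, d) / np.dot(p, p)   # closed-form least-squares scale
    return s * pred_rel
```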
- V-DETR: DETR with Vertex Relative Position Encoding for 3D Object Detection [73.37781484123536]
We introduce a highly performant 3D object detector for point clouds using the DETR framework.
To address its limitations, we introduce a novel 3D Vertex Relative Position Encoding (3DV-RPE) method.
We show exceptional results on the challenging ScanNetV2 benchmark.
arXiv Detail & Related papers (2023-08-08T17:14:14Z)
- NVDS+: Towards Efficient and Versatile Neural Stabilizer for Video Depth Estimation [58.21817572577012]
Video depth estimation aims to infer temporally consistent depth.
We introduce NVDS+ that stabilizes inconsistent depth estimated by various single-image models in a plug-and-play manner.
We also build a large-scale Video Depth in the Wild dataset, which contains 14,203 videos with over two million frames.
arXiv Detail & Related papers (2023-07-17T17:57:01Z)
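NVDS+ above is a learned, plug-and-play stabilizer. As a toy illustration of what "stabilizing" per-frame monocular depth means, the sketch below scale-aligns each frame to a running estimate and applies exponential smoothing; it is a naive baseline under these assumptions, not the NVDS+ method.

```python
import numpy as np

def smooth_depth_sequence(frames, alpha=0.8):
    """Naive temporal smoothing of per-frame depth maps from a single-image model.

    frames: list of (H, W) depth maps. Each frame is first median-aligned in
    scale to the running estimate (per-frame monocular depth is scale
    ambiguous), then blended with weight alpha to damp flicker.
    """
    out = [frames[0].astype(np.float64)]
    for d in frames[1:]:
        d = d.astype(np.float64)
        scale = np.median(out[-1]) / max(np.median(d), 1e-9)  # align scale
        out.append(alpha * out[-1] + (1.0 - alpha) * scale * d)
    return out
```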
- Lightweight Monocular Depth Estimation with an Edge Guided Network [34.03711454383413]
We present a novel lightweight Edge Guided Depth Estimation Network (EGD-Net).
In particular, we start out with a lightweight encoder-decoder architecture and embed an edge guidance branch.
In order to aggregate the context information and edge attention features, we design a transformer-based feature aggregation module.
arXiv Detail & Related papers (2022-09-29T14:45:47Z)
- A Benchmark and a Baseline for Robust Multi-view Depth Estimation [36.02034260946296]
Deep learning approaches for multi-view depth estimation are employed either in a depth-from-video or a multi-view stereo setting.
We introduce the Robust Multi-View Depth Benchmark that is built upon a set of public datasets.
We show that recent approaches do not generalize across datasets in this setting.
We present the Robust MVD Baseline model for multi-view depth estimation, which is built upon existing components but employs a novel scale augmentation procedure.
arXiv Detail & Related papers (2022-09-13T17:44:16Z)
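The Robust MVD Baseline above is credited with a novel scale augmentation procedure. Its exact form is not given here; the sketch below only shows the general idea of scale augmentation, under the assumption that rescaling ground-truth depth and camera translations by a common random factor keeps the geometry consistent while varying the metric scale the network sees.

```python
import numpy as np

def scale_augment(depth, cam_translations, scale_range=(0.5, 2.0), rng=None):
    """Jointly rescale scene depth and camera translations by one random factor.

    depth:            (H, W) ground-truth depth map
    cam_translations: list of (3,) translation vectors of the source cameras
    Because the whole geometry is scaled consistently, image content and
    relative poses stay valid; only the metric scale changes.
    """
    rng = rng or np.random.default_rng()
    s = rng.uniform(*scale_range)
    return depth * s, [t * s for t in cam_translations]
```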
- 360 Depth Estimation in the Wild -- The Depth360 Dataset and the SegFuse Network [35.03201732370496]
Single-view depth estimation from omnidirectional images has gained popularity with its wide range of applications such as autonomous driving and scene reconstruction.
In this work, we first establish a large-scale dataset with varied settings called Depth360 to tackle the training data problem.
We then propose an end-to-end two-branch multi-task learning network, SegFuse, that mimics the human eye to effectively learn from the dataset.
arXiv Detail & Related papers (2022-02-16T11:56:31Z)
- Are we ready for beyond-application high-volume data? The Reeds robot perception benchmark dataset [3.781421673607643]
This paper presents a dataset, called Reeds, for research on robot perception algorithms.
The dataset aims to provide demanding benchmark opportunities for algorithms, rather than providing an environment for testing application-specific solutions.
arXiv Detail & Related papers (2021-09-16T23:21:42Z)
- Sparse Auxiliary Networks for Unified Monocular Depth Prediction and Completion [56.85837052421469]
Estimating scene geometry from data obtained with cost-effective sensors is key for robots and self-driving cars.
In this paper, we study the problem of predicting dense depth from a single RGB image with optional sparse measurements from low-cost active depth sensors.
We introduce Sparse Auxiliary Networks (SANs), a new module enabling monodepth networks to perform both depth prediction and completion.
arXiv Detail & Related papers (2021-03-30T21:22:26Z)
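The entry above handles depth prediction and completion with one model. A generic way to let a single network accept optional sparse depth is to feed it as an extra zero-filled channel plus a validity mask, as in the toy sketch below; this is a generic pattern, not the Sparse Auxiliary Network architecture itself.

```python
import torch
import torch.nn as nn

class DepthNetWithOptionalSparse(nn.Module):
    """Toy network covering both depth prediction and depth completion.

    Input is RGB plus two extra channels: sparse depth (zeros when absent)
    and its validity mask, so the same weights serve both tasks.
    """
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(5, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1), nn.Softplus(),  # strictly positive depth
        )

    def forward(self, rgb, sparse_depth=None):
        if sparse_depth is None:                     # pure prediction mode
            sparse_depth = torch.zeros_like(rgb[:, :1])
        valid = (sparse_depth > 0).float()           # completion mode
        return self.net(torch.cat([rgb, sparse_depth, valid], dim=1))
```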
- Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt graph propagation to capture the observed spatial contexts.
We then apply an attention mechanism to the propagation, which encourages the network to model contextual information adaptively.
Finally, we introduce a symmetric gated fusion strategy to exploit the extracted multi-modal features effectively.
Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves the state-of-the-art performance on two benchmarks.
arXiv Detail & Related papers (2020-08-25T06:00:06Z)
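The ACMNet summary above mentions a symmetric gated fusion of multi-modal features. As a minimal sketch of the gating idea only (ACMNet's symmetric variant is more involved), a sigmoid gate can blend image-branch and depth-branch features per pixel:

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Blend two feature maps with a learned per-pixel, per-channel gate."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, feat_img, feat_depth):
        # The gate decides how much of each branch to keep at every location.
        g = torch.sigmoid(self.gate(torch.cat([feat_img, feat_depth], dim=1)))
        return g * feat_img + (1.0 - g) * feat_depth
```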
- Single Image Depth Estimation Trained via Depth from Defocus Cues [105.67073923825842]
Estimating depth from a single RGB image is a fundamental task in computer vision.
In this work, we rely on depth from focus cues instead of on different views.
We present results that are on par with supervised methods on KITTI and Make3D datasets and outperform unsupervised learning approaches.
arXiv Detail & Related papers (2020-01-14T20:22:54Z)
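The depth-from-defocus entry above relies on the link between object distance and blur size. For reference, the standard thin-lens circle-of-confusion relation can be computed as below; this is textbook optics, not the paper's training code.

```python
def circle_of_confusion(obj_dist, focus_dist, focal_len, f_number):
    """Blur-circle diameter for an object under the thin-lens model.

    obj_dist, focus_dist and focal_len share the same unit; f_number is the
    aperture f-stop, so the aperture diameter is focal_len / f_number.
    c = A * |obj_dist - focus_dist| / obj_dist * focal_len / (focus_dist - focal_len)
    The blur grows as the object moves away from the focus plane, which is
    the cue that lets defocus supervise depth.
    """
    aperture = focal_len / f_number
    return (aperture * abs(obj_dist - focus_dist) / obj_dist
            * focal_len / (focus_dist - focal_len))
```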
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.