FCDSN-DC: An Accurate and Lightweight Convolutional Neural Network for
Stereo Estimation with Depth Completion
- URL: http://arxiv.org/abs/2209.06525v1
- Date: Wed, 14 Sep 2022 09:56:19 GMT
- Title: FCDSN-DC: An Accurate and Lightweight Convolutional Neural Network for
Stereo Estimation with Depth Completion
- Authors: Dominik Hirner, Friedrich Fraundorfer
- Abstract summary: We propose an accurate and lightweight convolutional neural network for stereo estimation with depth completion.
We name this method fully-convolutional deformable similarity network with depth completion (FCDSN-DC).
We evaluate our method on challenging real-world indoor and outdoor scenes, in particular Middlebury, KITTI and ETH3D, where it produces competitive results.
- Score: 14.876446067338406
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose an accurate and lightweight convolutional neural network for
stereo estimation with depth completion. We name this method
fully-convolutional deformable similarity network with depth completion
(FCDSN-DC). This method extends FC-DCNN by improving the feature extractor,
adding a network structure for training highly accurate similarity functions
and a network structure for filling inconsistent disparity estimates. The whole
method consists of three parts. The first part consists of fully-convolutional
densely connected layers that compute expressive features of rectified image
pairs. The second part of our network learns highly accurate similarity
functions between these learned features. It consists of densely-connected
convolution layers with a deformable convolution block at the end to further
improve the accuracy of the results. After this step an initial disparity map
is created and the left-right consistency check is performed in order to remove
inconsistent points. The last part of the network then uses this input together
with the corresponding left RGB image in order to train a network that fills in
the missing measurements. Consistent depth estimations are gathered around
invalid points and are parsed together with the RGB points into a shallow CNN
network structure in order to recover the missing values. We evaluate our
method on challenging real world indoor and outdoor scenes, in particular
Middlebury, KITTI and ETH3D, where it produces competitive results. We
furthermore show that this method generalizes well and is well suited for many
applications without the need of further training. The code of our full
framework is available at: https://github.com/thedodo/FCDSN-DC
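
The left-right consistency check mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (their code is in the repository linked above). The function name `left_right_check`, the threshold `max_diff`, the use of `None` to mark invalid pixels, and the convention that a left pixel at column x with disparity d corresponds to the right pixel at column x - d are all assumptions made for this example.

```python
from typing import List, Optional


def left_right_check(
    disp_left: List[List[float]],
    disp_right: List[List[float]],
    max_diff: float = 1.0,
) -> List[List[Optional[float]]]:
    """Return a copy of disp_left with inconsistent pixels set to None.

    Assumed convention: left pixel (y, x) with disparity d matches
    right pixel (y, x - d). A pixel is kept only if the right-view
    disparity at the matched position agrees within max_diff.
    """
    height = len(disp_left)
    width = len(disp_left[0]) if height else 0
    checked: List[List[Optional[float]]] = []
    for y in range(height):
        row: List[Optional[float]] = []
        for x in range(width):
            d = disp_left[y][x]
            xr = int(round(x - d))  # matched column in the right view
            if 0 <= xr < width and abs(d - disp_right[y][xr]) <= max_diff:
                row.append(d)
            else:
                # Match falls outside the image or the two views disagree.
                row.append(None)
        checked.append(row)
    return checked


if __name__ == "__main__":
    left = [[1.0, 1.0, 1.0, 1.0]]
    right = [[1.0, 1.0, 7.0, 0.0]]
    print(left_right_check(left, right))
    # [[None, 1.0, 1.0, None]]
```

The pixels marked `None` are the "missing measurements" that a depth-completion stage, such as the one the abstract describes, would then need to fill in.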
Related papers
- SVNet: Where SO(3) Equivariance Meets Binarization on Point Cloud
Representation [65.4396959244269]
The paper tackles the challenge by designing a general framework to construct 3D learning architectures.
The proposed approach can be applied to general backbones like PointNet and DGCNN.
Experiments on ModelNet40, ShapeNet, and the real-world dataset ScanObjectNN demonstrate that the method achieves a great trade-off between efficiency, rotation, and accuracy.
arXiv Detail & Related papers (2022-09-13T12:12:19Z)
- What Can Be Learnt With Wide Convolutional Neural Networks? [69.55323565255631]
We study infinitely-wide deep CNNs in the kernel regime.
We prove that deep CNNs adapt to the spatial scale of the target function.
We conclude by computing the generalisation error of a deep CNN trained on the output of another deep CNN.
arXiv Detail & Related papers (2022-08-01T17:19:32Z)
- Focal Sparse Convolutional Networks for 3D Object Detection [121.45950754511021]
We introduce two new modules to enhance the capability of Sparse CNNs.
They are focal sparse convolution (Focals Conv) and its multi-modal variant of focal sparse convolution with fusion.
For the first time, we show that spatially learnable sparsity in sparse convolution is essential for sophisticated 3D object detection.
arXiv Detail & Related papers (2022-04-26T17:34:10Z)
- GCNDepth: Self-supervised Monocular Depth Estimation based on Graph
Convolutional Network [11.332580333969302]
This work brings a new solution with a set of improvements, which increase the quantitative and qualitative understanding of depth maps.
A graph convolutional network (GCN) can handle the convolution on non-Euclidean data and it can be applied to irregular image regions within a topological structure.
Our method provided comparable and promising results with a high prediction accuracy of 89% on the publicly available KITTI and Make3D datasets.
arXiv Detail & Related papers (2021-12-13T16:46:25Z)
- Learning Joint 2D-3D Representations for Depth Completion [90.62843376586216]
We design a simple yet effective neural network block that learns to extract joint 2D and 3D features.
Specifically, the block consists of two domain-specific sub-networks that apply 2D convolution on image pixels and continuous convolution on 3D points.
arXiv Detail & Related papers (2020-12-22T22:58:29Z)
- Do End-to-end Stereo Algorithms Under-utilize Information? [7.538482310185133]
We show how deep adaptive filtering and differentiable semi-global aggregation can be integrated in 2D and 3D convolutional networks for end-to-end stereo matching.
The improvements are due to utilizing RGB information from the images as a signal to dynamically guide the matching process.
arXiv Detail & Related papers (2020-10-14T18:32:39Z)
- FC-DCNN: A densely connected neural network for stereo estimation [14.876446067338406]
Our network consists of a fully-convolutional densely connected neural network (FC-DCNN).
FC-DCNN computes matching costs between rectified image pairs.
We show that our method works well for both challenging indoor and outdoor scenes.
arXiv Detail & Related papers (2020-10-14T10:59:09Z)
- Monocular Depth Estimation Using Multi Scale Neural Network And Feature
Fusion [0.0]
Our network uses two different blocks; the first uses different filter sizes for convolution and merges all the individual feature maps.
The second block uses dilated convolutions in place of fully connected layers, thus reducing computations and increasing the receptive field.
We train and test our network on the Make3D, NYU Depth V2 and KITTI datasets using standard evaluation metrics for depth estimation, comprising RMSE loss and SILog loss.
arXiv Detail & Related papers (2020-09-11T18:08:52Z)
- Learning Local Neighboring Structure for Robust 3D Shape Representation [143.15904669246697]
Representation learning for 3D meshes is important in many computer vision and graphics applications.
We propose a local structure-aware anisotropic convolutional operation (LSA-Conv).
Our model produces significant improvement in 3D shape reconstruction compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-04-21T13:40:03Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
- DELTAS: Depth Estimation by Learning Triangulation And densification of
Sparse points [14.254472131009653]
Multi-view stereo (MVS) is the golden mean between the accuracy of active depth sensing and the practicality of monocular depth estimation.
Cost volume based approaches employing 3D convolutional neural networks (CNNs) have considerably improved the accuracy of MVS systems.
We propose an efficient depth estimation approach by first (a) detecting and evaluating descriptors for interest points, then (b) learning to match and triangulate a small set of interest points, and finally (c) densifying this sparse set of 3D points using CNNs.
arXiv Detail & Related papers (2020-03-19T17:56:41Z)