FC-DCNN: A densely connected neural network for stereo estimation
- URL: http://arxiv.org/abs/2010.06950v1
- Date: Wed, 14 Oct 2020 10:59:09 GMT
- Title: FC-DCNN: A densely connected neural network for stereo estimation
- Authors: Dominik Hirner, Friedrich Fraundorfer
- Abstract summary: Our network consists of a fully-convolutional densely connected neural network (FC-DCNN).
FC-DCNN computes matching costs between rectified image pairs.
We show that our method works well for both challenging indoor and outdoor scenes.
- Score: 14.876446067338406
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel lightweight network for stereo estimation. Our network
consists of a fully-convolutional densely connected neural network (FC-DCNN)
that computes matching costs between rectified image pairs. Our FC-DCNN method
learns expressive features and performs some simple but effective
post-processing steps. The densely connected layer structure connects the
output of each layer to the input of each subsequent layer. This network
structure and the fact that we do not use any fully-connected layers or 3D
convolutions lead to a very lightweight network. The output of this network is
used in order to calculate matching costs and create a cost-volume. Instead of
using time- and memory-inefficient cost-aggregation methods such as semi-global
matching or conditional random fields in order to improve the result, we rely
on filtering techniques, namely median filter and guided filter. By computing a
left-right consistency check we get rid of inconsistent values. Afterwards we
use a watershed foreground-background segmentation on the disparity image with
removed inconsistencies. This mask is then used to refine the final prediction.
We show that our method works well for both challenging indoor and outdoor
scenes by evaluating it on the Middlebury, KITTI and ETH3D benchmarks.
Our full framework is available at
https://github.com/thedodo/FC-DCNN
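The left-right consistency check mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not taken from the paper or its repository: the function name `left_right_consistency`, the `max_diff` threshold, and the use of NaN to mark rejected pixels are all assumptions made for illustration. It also assumes a rectified pair in which a left-image pixel at column x with disparity d matches the right-image pixel at column x - d.

```python
import numpy as np

def left_right_consistency(disp_left, disp_right, max_diff=1.0):
    """Return disp_left with inconsistent pixels set to NaN.

    Assumption (illustrative, not from the paper): a left-image pixel
    (y, x) with disparity d corresponds to the right-image pixel
    (y, x - d). The pixel is kept only if that location lies inside the
    image and the right map reports a disparity within max_diff of d.
    """
    disp_left = np.asarray(disp_left, dtype=np.float64)
    disp_right = np.asarray(disp_right, dtype=np.float64)
    height, width = disp_left.shape

    # Column index of every pixel, and the matching column in the right image.
    xs = np.arange(width)[None, :]
    x_right = xs - np.rint(disp_left).astype(np.int64)
    in_bounds = (x_right >= 0) & (x_right < width)

    # Clip so the lookup is always valid; out-of-bounds pixels are rejected below.
    x_lookup = np.clip(x_right, 0, width - 1)
    rows = np.arange(height)[:, None]
    disp_right_at_match = disp_right[rows, x_lookup]

    consistent = in_bounds & (np.abs(disp_left - disp_right_at_match) <= max_diff)
    return np.where(consistent, disp_left, np.nan)
```

In a pipeline like the one described, the NaN entries would be the "removed inconsistencies" that a later refinement step fills in; how that filling is done is left out of this sketch.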
Related papers
- Deep Convolutional Tables: Deep Learning without Convolutions [12.069186324544347]
We propose a novel formulation of deep networks that do not use dot-product neurons and rely on a hierarchy of voting tables instead.
Deep CT networks have been experimentally shown to have accuracy comparable to that of CNNs of similar architectures.
arXiv Detail & Related papers (2023-04-23T17:49:21Z)
- FCDSN-DC: An Accurate and Lightweight Convolutional Neural Network for Stereo Estimation with Depth Completion [14.876446067338406]
We propose an accurate and lightweight convolutional neural network for stereo estimation with depth completion.
We name this method fully-convolutional deformable similarity network with depth completion (FCDSN-DC).
We evaluate our method on challenging real-world indoor and outdoor scenes, in particular Middlebury, KITTI and ETH3D, where it produces competitive results.
arXiv Detail & Related papers (2022-09-14T09:56:19Z)
- Cherry-Picking Gradients: Learning Low-Rank Embeddings of Visual Data via Differentiable Cross-Approximation [53.95297550117153]
We propose an end-to-end trainable framework that processes large-scale visual data tensors by looking at a fraction of their entries only.
The proposed approach is particularly useful for large-scale multidimensional grid data, and for tasks that require context over a large receptive field.
arXiv Detail & Related papers (2021-05-29T08:39:57Z)
- Do End-to-end Stereo Algorithms Under-utilize Information? [7.538482310185133]
We show how deep adaptive filtering and differentiable semi-global aggregation can be integrated in 2D and 3D convolutional networks for end-to-end stereo matching.
The improvements are due to utilizing RGB information from the images as a signal to dynamically guide the matching process.
arXiv Detail & Related papers (2020-10-14T18:32:39Z)
- Monocular Depth Estimation Using Multi Scale Neural Network And Feature Fusion [0.0]
Our network uses two different blocks: the first uses different filter sizes for convolution and merges all the individual feature maps.
The second block uses dilated convolutions in place of fully connected layers, thus reducing computations and increasing the receptive field.
We train and test our network on the Make3D, NYU Depth V2 and KITTI datasets using standard evaluation metrics for depth estimation, comprising RMSE loss and SILog loss.
arXiv Detail & Related papers (2020-09-11T18:08:52Z)
- Distillation Guided Residual Learning for Binary Convolutional Neural Networks [83.6169936912264]
It is challenging to bridge the performance gap between Binary CNN (BCNN) and Floating point CNN (FCNN).
We observe that this performance gap leads to substantial residuals between intermediate feature maps of BCNN and FCNN.
To minimize the performance gap, we force the BCNN to produce intermediate feature maps similar to those of the FCNN.
This training strategy, i.e., optimizing each binary convolutional block with a block-wise distillation loss derived from the FCNN, leads to a more effective optimization of the BCNN.
arXiv Detail & Related papers (2020-07-10T07:55:39Z)
- The Neural Tangent Link Between CNN Denoisers and Non-Local Filters [4.254099382808598]
Convolutional Neural Networks (CNNs) are now a well-established tool for solving computational imaging problems.
We introduce a formal link between such networks through their neural tangent kernel (NTK) and well-known non-local filtering techniques.
We evaluate our findings via extensive image denoising experiments.
arXiv Detail & Related papers (2020-06-03T16:50:54Z)
- Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation [51.17232267143098]
We propose a novel system named Disp R-CNN for 3D object detection from stereo images.
We use a statistical shape model to generate dense disparity pseudo-ground-truth without the need of LiDAR point clouds.
Experiments on the KITTI dataset show that, even when LiDAR ground-truth is not available at training time, Disp R-CNN achieves competitive performance and outperforms previous state-of-the-art methods by 20% in terms of average precision.
arXiv Detail & Related papers (2020-04-07T17:48:45Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
- R-FCN: Object Detection via Region-based Fully Convolutional Networks [87.62557357527861]
We present region-based, fully convolutional networks for accurate and efficient object detection.
Our result is achieved at a test-time speed of 170ms per image, 2.5-20x faster than the Faster R-CNN counterpart.
arXiv Detail & Related papers (2016-05-20T15:50:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.