Image-Coupled Volume Propagation for Stereo Matching
- URL: http://arxiv.org/abs/2301.00695v1
- Date: Fri, 30 Dec 2022 13:23:25 GMT
- Title: Image-Coupled Volume Propagation for Stereo Matching
- Authors: Oh-Hun Kwon, Eduard Zell
- Abstract summary: We propose a new way to process the 4D cost volume where we merge two different concepts in one framework to achieve a symbiotic relationship.
A feature matching part is responsible for identifying matching pixel pairs along the baseline, while a concurrent image volume part is inspired by depth-from-mono CNNs.
Our end-to-end trained CNN is ranked 2nd on KITTI2012 and ETH3D benchmarks while being significantly faster than the 1st-ranked method.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Several leading methods on public benchmarks for depth-from-stereo rely on
memory-demanding 4D cost volumes and computationally intensive 3D convolutions
for feature matching. We suggest a new way to process the 4D cost volume where
we merge two different concepts in one deeply integrated framework to achieve a
symbiotic relationship. A feature matching part is responsible for identifying
matching pixel pairs along the baseline, while a concurrent image volume part
is inspired by depth-from-mono CNNs. However, instead of predicting depth
directly from image features, it provides additional context to resolve
ambiguities during pixel matching. More technically, the processing of the 4D
cost volume is separated into a 2D propagation and a 3D propagation part.
Starting from feature maps of the left image, the 2D propagation assists the 3D
propagation part of the cost volume at different layers by adding visual
features to the geometric context. By combining both parts, we can safely
reduce the scale of 3D convolution layers in the matching part without
sacrificing accuracy. Experiments demonstrate that our end-to-end trained CNN
is ranked 2nd on KITTI2012 and ETH3D benchmarks while being significantly
faster than the 1st-ranked method. Furthermore, we notice that the coupling of
image and matching-volume improves fine-scale details as demonstrated by our
qualitative analysis.
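The abstract describes the 2D/3D coupling only in prose. The following is a minimal PyTorch sketch of that idea, assuming a concatenation-based cost volume and an additive broadcast of 2D image features over the disparity axis; module names, channel sizes, and the injection mechanism are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: a 2D image branch assists 3D cost-volume propagation.
# All names and sizes are assumptions, not the paper's actual architecture.
import torch
import torch.nn as nn

def build_cost_volume(left_feat, right_feat, max_disp):
    """Concatenation-based 4D cost volume of shape (B, 2C, max_disp, H, W)."""
    b, c, h, w = left_feat.shape
    volume = left_feat.new_zeros(b, 2 * c, max_disp, h, w)
    for d in range(max_disp):
        volume[:, :c, d, :, d:] = left_feat[:, :, :, d:]
        volume[:, c:, d, :, d:] = right_feat[:, :, :, :w - d]
    return volume

class CoupledPropagation(nn.Module):
    """One coupled stage: 2D propagation on left-image features assists the
    3D propagation of the cost volume by injecting visual context."""
    def __init__(self, feat_ch, vol_ch):
        super().__init__()
        self.prop2d = nn.Sequential(
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True))
        self.to_vol = nn.Conv2d(feat_ch, vol_ch, 1)  # project to volume channels
        self.prop3d = nn.Sequential(
            nn.Conv3d(vol_ch, vol_ch, 3, padding=1), nn.ReLU(inplace=True))

    def forward(self, feat2d, volume):
        feat2d = self.prop2d(feat2d)
        # Broadcast image context over the disparity dimension (dim 2), adding
        # visual features to the geometric context at this layer.
        volume = self.prop3d(volume + self.to_vol(feat2d).unsqueeze(2))
        return feat2d, volume

# Usage with toy tensors (quarter-resolution features, 24 disparity hypotheses):
left_f, right_f = torch.randn(1, 32, 64, 128), torch.randn(1, 32, 64, 128)
vol = build_cost_volume(left_f, right_f, max_disp=24)  # (1, 64, 24, 64, 128)
stage = CoupledPropagation(feat_ch=32, vol_ch=64)
left_f, vol = stage(left_f, vol)
```

Because the 2D branch carries the visual detail, the 3D branch can get by with fewer or narrower Conv3d layers, which mirrors the paper's claim that the coupling lets the matching part reduce the scale of its 3D convolutions without sacrificing accuracy.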
Related papers
- ImplicitVol: Sensorless 3D Ultrasound Reconstruction with Deep Implicit
Representation [13.71137201718831]
The objective of this work is to achieve sensorless reconstruction of a 3D volume from a set of 2D freehand ultrasound images with deep implicit representation.
In contrast to the conventional approach of representing a 3D volume as a discrete voxel grid, we parameterize it as the zero level-set of a continuous function.
Our proposed model, ImplicitVol, takes a set of 2D scans and their estimated locations in 3D as input, jointly refining the estimated 3D locations and learning a full reconstruction of the 3D volume.
arXiv Detail & Related papers (2021-09-24T17:59:18Z)
- Bidirectional Projection Network for Cross Dimension Scene Understanding [69.29443390126805]
We present a bidirectional projection network (BPNet) for joint 2D and 3D reasoning in an end-to-end manner.
Via the bidirectional projection module (BPM), complementary 2D and 3D information can interact with each other at multiple architectural levels.
Our BPNet achieves top performance on the ScanNetV2 benchmark for both 2D and 3D semantic segmentation.
arXiv Detail & Related papers (2021-03-26T08:31:39Z)
- Stereo Object Matching Network [78.35697025102334]
This paper presents a stereo object matching method that exploits both 2D contextual information from images and 3D object-level information.
We present two novel strategies to handle 3D objectness in the cost volume space: selective sampling (RoISelect) and 2D-3D fusion.
arXiv Detail & Related papers (2021-03-23T12:54:43Z)
- Learning Joint 2D-3D Representations for Depth Completion [90.62843376586216]
We design a simple yet effective neural network block that learns to extract joint 2D and 3D features.
Specifically, the block consists of two domain-specific sub-networks that apply 2D convolution on image pixels and continuous convolution on 3D points.
arXiv Detail & Related papers (2020-12-22T22:58:29Z)
- Spatial Context-Aware Self-Attention Model For Multi-Organ Segmentation [18.76436457395804]
Multi-organ segmentation is one of the most successful applications of deep learning in medical image analysis.
Deep convolutional neural nets (CNNs) have shown great promise in achieving clinically applicable image segmentation performance on CT or MRI images.
We propose a new framework for combining 3D and 2D models, in which the segmentation is realized through high-resolution 2D convolutions.
arXiv Detail & Related papers (2020-12-16T21:39:53Z)
- Displacement-Invariant Cost Computation for Efficient Stereo Matching [122.94051630000934]
Deep learning methods have dominated stereo matching leaderboards by yielding unprecedented disparity accuracy.
But their inference time is typically slow, on the order of seconds for a pair of 540p images.
We propose a displacement-invariant cost module to compute the matching costs without needing a 4D feature volume (a sketch of this idea follows this list).
arXiv Detail & Related papers (2020-12-01T23:58:16Z)
- Displacement-Invariant Matching Cost Learning for Accurate Optical Flow Estimation [109.64756528516631]
Learning matching costs has been shown to be critical to the success of the state-of-the-art deep stereo matching methods.
This paper proposes a novel solution that is able to bypass the requirement of building a 5D feature volume.
Our approach achieves state-of-the-art accuracy on various datasets, and outperforms all published optical flow methods on the Sintel benchmark.
arXiv Detail & Related papers (2020-10-28T09:57:00Z)
- Pix2Vox++: Multi-scale Context-aware 3D Object Reconstruction from Single and Multiple Images [56.652027072552606]
We propose a novel framework for single-view and multi-view 3D object reconstruction, named Pix2Vox++.
By using a well-designed encoder-decoder, it generates a coarse 3D volume from each input image.
A multi-scale context-aware fusion module is then introduced to adaptively select high-quality reconstructions for different parts from all coarse 3D volumes to obtain a fused 3D volume.
arXiv Detail & Related papers (2020-06-22T13:48:09Z)
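As referenced in the Displacement-Invariant Cost Computation entry above, here is a hedged PyTorch sketch of computing matching costs with one shared 2D network applied once per disparity hypothesis, so that no 4D feature volume or 3D convolutions are needed. The network depth, channel counts, and the soft-argmin readout are illustrative assumptions, not the published architecture.

```python
# Hedged sketch of a displacement-invariant cost module: the same 2D network
# scores every disparity hypothesis. Names and sizes are assumptions.
import torch
import torch.nn as nn

class DisplacementInvariantCost(nn.Module):
    """Shared 2D matching network applied per disparity hypothesis,
    producing a (B, max_disp, H, W) cost map without a 4D feature volume."""
    def __init__(self, feat_ch):
        super().__init__()
        self.match2d = nn.Sequential(  # reused for every displacement
            nn.Conv2d(2 * feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, 1, 3, padding=1))

    def forward(self, left_feat, right_feat, max_disp):
        w = left_feat.shape[-1]
        costs = []
        for d in range(max_disp):
            # Shift right features by d so matching pixels align column-wise.
            shifted = torch.zeros_like(right_feat)
            shifted[:, :, :, d:] = right_feat[:, :, :, :w - d]
            costs.append(self.match2d(torch.cat([left_feat, shifted], dim=1)))
        return torch.cat(costs, dim=1)  # (B, max_disp, H, W)

# Soft-argmin disparity regression over the per-displacement costs:
left_f, right_f = torch.randn(1, 32, 64, 128), torch.randn(1, 32, 64, 128)
cost = DisplacementInvariantCost(feat_ch=32)(left_f, right_f, max_disp=24)
prob = torch.softmax(-cost, dim=1)  # lower cost -> higher matching probability
disp = (prob * torch.arange(24.0).view(1, 24, 1, 1)).sum(dim=1)  # (B, H, W)
```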
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.