MEStereo-Du2CNN: A Novel Dual Channel CNN for Learning Robust Depth
Estimates from Multi-exposure Stereo Images for HDR 3D Applications
- URL: http://arxiv.org/abs/2206.10375v1
- Date: Tue, 21 Jun 2022 13:23:22 GMT
- Title: MEStereo-Du2CNN: A Novel Dual Channel CNN for Learning Robust Depth
Estimates from Multi-exposure Stereo Images for HDR 3D Applications
- Authors: Rohit Choudhary and Mansi Sharma and Uma T V and Rithvik Anil
- Abstract summary: We develop a novel deep architecture for multi-exposure stereo depth estimation.
For the stereo depth estimation component of our architecture, a mono-to-stereo transfer learning approach is deployed.
In terms of performance, the proposed model surpasses state-of-the-art monocular and stereo depth estimation methods.
- Score: 0.22940141855172028
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Display technologies have evolved over the years. It is critical to develop
practical HDR capturing, processing, and display solutions to bring 3D
technologies to the next level. Depth estimation of multi-exposure stereo image
sequences is an essential task in the development of cost-effective 3D HDR
video content. In this paper, we develop a novel deep architecture for
multi-exposure stereo depth estimation. The proposed architecture has two novel
components. First, the stereo matching technique used in traditional stereo
depth estimation is revamped. For the stereo depth estimation component of our
architecture, a mono-to-stereo transfer learning approach is deployed. The
proposed formulation circumvents the cost volume construction requirement,
which is replaced by a ResNet-based dual-encoder single-decoder CNN with
different weights for feature fusion. EfficientNet-based blocks are used to
learn the disparity. Second, we combine the disparity maps obtained from the
stereo images at different exposure levels using a robust disparity feature
fusion approach. The disparity maps obtained at different exposures are merged
using weight maps calculated for different quality measures. The final
predicted disparity map is more robust and retains the best features that
preserve the depth discontinuities. The proposed CNN offers flexibility to
train using standard dynamic range stereo data or with multi-exposure low
dynamic range stereo sequences. In terms of performance, the proposed model
surpasses state-of-the-art monocular and stereo depth estimation methods, both
quantitatively and qualitatively, on the challenging Scene Flow and differently
exposed Middlebury stereo datasets. The architecture performs exceedingly well
on complex natural scenes, demonstrating its usefulness for diverse 3D HDR
applications.
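A minimal PyTorch sketch of the two components described in the abstract is given below: (i) a dual-encoder, single-decoder network whose two ResNet-style encoders share an architecture but not weights and whose features are fused without constructing a cost volume, and (ii) a per-pixel weight-map fusion of the disparity maps predicted at different exposures. The block widths, the plain residual blocks standing in for the paper's ResNet/EfficientNet blocks, and the local-contrast quality measure are illustrative assumptions, not the published MEStereo-Du2CNN configuration.

```python
# Sketch only: layer sizes and the contrast-based quality measure are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch))

    def forward(self, x):
        return F.relu(x + self.body(x))


class Encoder(nn.Module):
    """One encoder branch; left and right views each get their own copy (own weights)."""
    def __init__(self, ch=32):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(3, ch, 7, stride=2, padding=3), nn.ReLU(inplace=True))
        self.stage1 = nn.Sequential(ResBlock(ch), nn.Conv2d(ch, 2 * ch, 3, stride=2, padding=1))
        self.stage2 = nn.Sequential(ResBlock(2 * ch), nn.Conv2d(2 * ch, 4 * ch, 3, stride=2, padding=1))

    def forward(self, x):
        f1 = self.stem(x)      # 1/2 resolution
        f2 = self.stage1(f1)   # 1/4 resolution
        f3 = self.stage2(f2)   # 1/8 resolution
        return f1, f2, f3


class DualEncoderSingleDecoder(nn.Module):
    """Dual-encoder single-decoder CNN: view features are concatenated and decoded
    directly into disparity, with no cost volume over disparity hypotheses."""
    def __init__(self, ch=32):
        super().__init__()
        self.enc_left, self.enc_right = Encoder(ch), Encoder(ch)   # different weights
        self.fuse = nn.Conv2d(8 * ch, 4 * ch, 1)                   # learned feature fusion
        self.up1 = nn.ConvTranspose2d(4 * ch, 2 * ch, 4, stride=2, padding=1)
        self.up2 = nn.ConvTranspose2d(4 * ch, ch, 4, stride=2, padding=1)
        self.up3 = nn.ConvTranspose2d(2 * ch, ch, 4, stride=2, padding=1)
        self.head = nn.Conv2d(ch, 1, 3, padding=1)

    def forward(self, left, right):
        l1, l2, l3 = self.enc_left(left)
        r1, r2, r3 = self.enc_right(right)
        x = F.relu(self.fuse(torch.cat([l3, r3], dim=1)))
        x = F.relu(self.up1(x))
        x = F.relu(self.up2(torch.cat([x, l2 + r2], dim=1)))   # skip connections
        x = F.relu(self.up3(torch.cat([x, l1 + r1], dim=1)))
        return F.relu(self.head(x))   # non-negative disparity at input resolution


def fuse_exposure_disparities(disps, left_images, eps=1e-6):
    """Merge per-exposure disparity maps with per-pixel weight maps.
    Local contrast of each exposure serves as a stand-in quality measure."""
    weights = []
    for img in left_images:
        gray = img.mean(dim=1, keepdim=True)
        contrast = torch.abs(gray - F.avg_pool2d(gray, 3, stride=1, padding=1))
        weights.append(contrast + eps)
    w = torch.stack(weights)                      # (exposures, B, 1, H, W)
    w = w / w.sum(dim=0, keepdim=True)            # normalise across exposures
    return (torch.stack(disps) * w).sum(dim=0)    # weighted per-pixel merge


if __name__ == "__main__":
    net = DualEncoderSingleDecoder()
    pairs = [(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)) for _ in range(3)]
    disps = [net(l, r) for l, r in pairs]
    fused = fuse_exposure_disparities(disps, [l for l, _ in pairs])
    print(fused.shape)   # torch.Size([1, 1, 64, 64])
```

In this sketch, concatenating the two encoder feature stacks and decoding them directly is what replaces the cost-volume construction step highlighted in the abstract; the exposure-level merge then favours, at each pixel, the disparity predicted from the better-exposed input.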
Related papers
- SDGE: Stereo Guided Depth Estimation for 360$^\circ$ Camera Sets [65.64958606221069]
Multi-camera systems are often used in autonomous driving to achieve a 360$^\circ$ perception.
These 360$^\circ$ camera sets often have limited or low-quality overlap regions, making multi-view stereo methods infeasible for the entire image.
We propose the Stereo Guided Depth Estimation (SGDE) method, which enhances depth estimation of the full image by explicitly utilizing multi-view stereo results on the overlap.
arXiv Detail & Related papers (2024-02-19T02:41:37Z)
- R3D3: Dense 3D Reconstruction of Dynamic Scenes from Multiple Cameras [106.52409577316389]
R3D3 is a multi-camera system for dense 3D reconstruction and ego-motion estimation.
Our approach exploits spatial-temporal information from multiple cameras, and monocular depth refinement.
We show that this design enables a dense, consistent 3D reconstruction of challenging, dynamic outdoor environments.
arXiv Detail & Related papers (2023-08-28T17:13:49Z)
- DynamicStereo: Consistent Dynamic Depth from Stereo Videos [91.1804971397608]
We propose DynamicStereo to estimate disparity for stereo videos.
The network learns to pool information from neighboring frames to improve the temporal consistency of its predictions.
We also introduce Dynamic Replica, a new benchmark dataset containing synthetic videos of people and animals in scanned environments.
arXiv Detail & Related papers (2023-05-03T17:40:49Z)
- DiffuStereo: High Quality Human Reconstruction via Diffusion-based Stereo Using Sparse Cameras [33.6247548142638]
We propose DiffuStereo, a novel system using only sparse cameras for high-quality 3D human reconstruction.
At its core is a novel diffusion-based stereo module, which introduces diffusion models into the iterative stereo matching network.
We present a multi-level stereo network architecture to handle high-resolution (up to 4K) inputs without requiring an unaffordable memory footprint.
arXiv Detail & Related papers (2022-07-16T19:08:18Z)
- DSGN++: Exploiting Visual-Spatial Relation for Stereo-based 3D Detectors [60.88824519770208]
Camera-based 3D object detectors are welcome due to their wider deployment and lower price than LiDAR sensors.
We revisit the prior stereo model DSGN and its stereo volume construction for representing both 3D geometry and semantics.
We propose our approach, DSGN++, aiming for improving information flow throughout the 2D-to-3D pipeline.
arXiv Detail & Related papers (2022-04-06T18:43:54Z)
- Neural Radiance Fields Approach to Deep Multi-View Photometric Stereo [103.08512487830669]
We present a modern solution to the multi-view photometric stereo (MVPS) problem.
We procure the surface orientation using a photometric stereo (PS) image formation model and blend it with a multi-view neural radiance field representation to recover the object's surface geometry.
Our method performs neural rendering of multi-view images while utilizing surface normals estimated by a deep photometric stereo network.
arXiv Detail & Related papers (2021-10-11T20:20:03Z)
- A Novel Unified Model for Multi-exposure Stereo Coding Based on Low Rank Tucker-ALS and 3D-HEVC [0.6091702876917279]
We propose an efficient scheme for coding multi-exposure stereo images based on a tensor low-rank approximation scheme.
Multi-exposure fusion can be realized at the decoder to generate HDR stereo output with increased realism and binocular 3D depth cues.
Encoding with 3D-HEVC enhances the efficiency of the proposed scheme by exploiting intra-frame, inter-view, and inter-component redundancies in the low-rank approximated representation.
arXiv Detail & Related papers (2021-04-10T10:10:14Z)
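The low-rank idea behind this coding scheme can be illustrated with a short numpy sketch: a truncated Tucker (HOSVD-style) approximation of a multi-exposure stereo stack. This is a one-pass projection rather than the authors' Tucker-ALS iteration, and the tensor layout and ranks below are assumptions for illustration only.

```python
# numpy sketch of a truncated Tucker (HOSVD-style) approximation; layout and ranks are illustrative.
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def fold(M, mode, shape):
    """Inverse of `unfold` for a target tensor of the given shape."""
    full = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(M.reshape(full), 0, mode)

def tucker_truncate(T, ranks):
    """Keep the leading `ranks[n]` left singular vectors of each unfolding,
    then project T onto those subspaces to obtain a small core tensor."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])
    core = T
    for mode, U in enumerate(factors):
        shape = core.shape[:mode] + (U.shape[1],) + core.shape[mode + 1:]
        core = fold(U.T @ unfold(core, mode), mode, shape)
    return core, factors

def tucker_reconstruct(core, factors):
    T = core
    for mode, U in enumerate(factors):
        shape = T.shape[:mode] + (U.shape[0],) + T.shape[mode + 1:]
        T = fold(U @ unfold(T, mode), mode, shape)
    return T

# Hypothetical stack: 2 views x 3 exposures flattened into the first mode, then H x W.
stack = np.random.rand(6, 64, 96)
core, factors = tucker_truncate(stack, ranks=(4, 24, 32))
approx = tucker_reconstruct(core, factors)
print(core.shape, np.linalg.norm(stack - approx) / np.linalg.norm(stack))
```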
- SMD-Nets: Stereo Mixture Density Networks [68.56947049719936]
We propose Stereo Mixture Density Networks (SMD-Nets), a simple yet effective learning framework compatible with a wide class of 2D and 3D architectures.
Specifically, we exploit bimodal mixture densities as output representation and show that this allows for sharp and precise disparity estimates near discontinuities.
We carry out comprehensive experiments on a new high-resolution and highly realistic synthetic stereo dataset, consisting of stereo pairs at 8Mpx resolution, as well as on real-world stereo datasets.
arXiv Detail & Related papers (2021-04-08T16:15:46Z)
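The bimodal output representation can be sketched in a few lines of PyTorch: the network predicts, per pixel, a mixing weight and the parameters of two modes, and the final disparity is read off from the dominant mode so that edges stay sharp instead of being blended. The channel sizes, the Laplacian parameterisation, and the winner-take-all readout here are illustrative assumptions, not the exact SMD-Nets head or losses.

```python
# Illustrative bimodal mixture output head; sizes and parameterisation are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BimodalMixtureHead(nn.Module):
    """Predicts, per pixel, a two-component mixture over disparity:
    a mixing weight pi and a (mean, scale) pair for each mode."""
    def __init__(self, in_ch=64, max_disp=192.0):
        super().__init__()
        self.max_disp = max_disp
        self.conv = nn.Conv2d(in_ch, 5, kernel_size=3, padding=1)

    def forward(self, feat):
        pi, mu1, b1, mu2, b2 = torch.chunk(self.conv(feat), 5, dim=1)
        pi = torch.sigmoid(pi)                      # weight of the first mode
        mu1 = torch.sigmoid(mu1) * self.max_disp    # mode means in [0, max_disp]
        mu2 = torch.sigmoid(mu2) * self.max_disp
        b1 = F.softplus(b1) + 1e-3                  # positive Laplacian scales
        b2 = F.softplus(b2) + 1e-3
        return pi, mu1, b1, mu2, b2


def nll_bimodal_laplacian(params, gt):
    """Negative log-likelihood of ground-truth disparity under the mixture."""
    pi, mu1, b1, mu2, b2 = params
    lap1 = torch.exp(-torch.abs(gt - mu1) / b1) / (2 * b1)
    lap2 = torch.exp(-torch.abs(gt - mu2) / b2) / (2 * b2)
    return -torch.log(pi * lap1 + (1 - pi) * lap2 + 1e-12).mean()


def winner_take_all_disparity(params):
    """Take the mean of the dominant mode: no blending across depth discontinuities."""
    pi, mu1, _, mu2, _ = params
    return torch.where(pi >= 0.5, mu1, mu2)


if __name__ == "__main__":
    head = BimodalMixtureHead()
    params = head(torch.rand(1, 64, 32, 32))
    disp = winner_take_all_disparity(params)
    loss = nll_bimodal_laplacian(params, torch.rand(1, 1, 32, 32) * 192)
    print(disp.shape, float(loss))
```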
- Du$^2$Net: Learning Depth Estimation from Dual-Cameras and Dual-Pixels [16.797169907541164]
We present a novel approach based on neural networks for depth estimation that combines stereo from dual cameras with stereo from a dual-pixel sensor.
Our network uses a novel architecture to fuse these two sources of information and can overcome the limitations of pure binocular stereo matching.
arXiv Detail & Related papers (2020-03-31T15:39:43Z)