DiffuStereo: High Quality Human Reconstruction via Diffusion-based
Stereo Using Sparse Cameras
- URL: http://arxiv.org/abs/2207.08000v2
- Date: Wed, 20 Jul 2022 08:12:00 GMT
- Authors: Ruizhi Shao, Zerong Zheng, Hongwen Zhang, Jingxiang Sun, Yebin Liu
- Abstract summary: We propose DiffuStereo, a novel system using only sparse cameras for high-quality 3D human reconstruction.
At its core is a novel diffusion-based stereo module, which introduces diffusion models into the iterative stereo matching network.
We present a multi-level stereo network architecture to handle high-resolution (up to 4k) inputs without requiring an unaffordable memory footprint.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose DiffuStereo, a novel system using only sparse cameras (8 in this
work) for high-quality 3D human reconstruction. At its core is a novel
diffusion-based stereo module, which introduces diffusion models, a powerful
class of generative models, into the iterative stereo matching network. To this
end, we design a new diffusion kernel and additional stereo constraints to
facilitate stereo matching and depth estimation in the network. We further
present a multi-level stereo network architecture to handle high-resolution (up
to 4k) inputs without requiring an unaffordable memory footprint. Given a set of
sparse-view color images of a human, the proposed multi-level diffusion-based
stereo network can produce highly accurate depth maps, which are then converted
into a high-quality 3D human model through an efficient multi-view fusion
strategy. Overall, our method enables automatic reconstruction of human models
with quality on par with high-end dense-view camera rigs, and this is achieved
using a much more lightweight hardware setup. Experiments show that our method
outperforms state-of-the-art methods by a large margin both qualitatively and
quantitatively.
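The abstract mentions turning stereo matches into depth maps. As a rough illustration only, and not the DiffuStereo network or its diffusion kernel, the sketch below shows the standard relation for a rectified pinhole stereo pair, Z = f * B / d. The function name `disparity_to_depth` and the sample focal length and baseline are assumptions made for this example, not values from the paper.

```python
# Minimal sketch, assuming a rectified pinhole stereo pair.
# The names and numbers here are hypothetical and not taken from the paper.

def disparity_to_depth(disparity_px, focal_px, baseline_m, min_disparity=1e-6):
    """Convert a disparity map (pixels) to depth (meters) via Z = f * B / d.

    Pixels whose disparity is at or below min_disparity are treated as
    infinitely far away instead of dividing by zero.
    """
    depth = []
    for row in disparity_px:
        depth.append([
            focal_px * baseline_m / d if d > min_disparity else float("inf")
            for d in row
        ])
    return depth


# Toy 2x2 disparity map; a larger disparity means a closer surface.
disparity = [[40.0, 20.0], [10.0, 0.0]]
depth = disparity_to_depth(disparity, focal_px=2000.0, baseline_m=0.25)
print(depth)
```

Because depth is inversely proportional to disparity, a fixed sub-pixel matching error costs more depth accuracy at small disparities, which is one reason accurate high-resolution matching matters for sparse camera setups.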
Related papers
- Pixel-Aligned Multi-View Generation with Depth Guided Decoder [86.1813201212539]
We propose a novel method for pixel-level image-to-multi-view generation.
Unlike prior work, we incorporate attention layers across multi-view images in the VAE decoder of a latent video diffusion model.
Our model enables better pixel alignment across multi-view images.
arXiv Detail & Related papers (2024-08-26T04:56:41Z)
- Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention [87.02613021058484]
We introduce Era3D, a novel multiview diffusion method that generates high-resolution multiview images from a single-view image.
Era3D generates high-quality multiview images at resolutions up to 512x512 while reducing complexity by 12x.
arXiv Detail & Related papers (2024-05-19T17:13:16Z)
- MVD-Fusion: Single-view 3D via Depth-consistent Multi-view Generation [54.27399121779011]
We present MVD-Fusion: a method for single-view 3D inference via generative modeling of multi-view-consistent RGB-D images.
We show that our approach can yield more accurate synthesis compared to recent state-of-the-art, including distillation-based 3D inference and prior multi-view generation methods.
arXiv Detail & Related papers (2024-04-04T17:59:57Z)
- MEStereo-Du2CNN: A Novel Dual Channel CNN for Learning Robust Depth
Estimates from Multi-exposure Stereo Images for HDR 3D Applications [0.22940141855172028]
We develop a novel deep architecture for multi-exposure stereo depth estimation.
For the stereo depth estimation component of our architecture, a mono-to-stereo transfer learning approach is deployed.
In terms of performance, the proposed model surpasses state-of-the-art monocular and stereo depth estimation methods.
arXiv Detail & Related papers (2022-06-21T13:23:22Z)
- Neural 3D Reconstruction in the Wild [86.6264706256377]
We introduce a new method that enables efficient and accurate surface reconstruction from Internet photo collections.
We present a new benchmark and protocol for evaluating reconstruction performance on such in-the-wild scenes.
arXiv Detail & Related papers (2022-05-25T17:59:53Z)
- Neural Disparity Refinement for Arbitrary Resolution Stereo [67.55946402652778]
We introduce a novel architecture for neural disparity refinement aimed at facilitating deployment of 3D computer vision on cheap and widespread consumer devices.
Our approach relies on a continuous formulation that makes it possible to estimate a refined disparity map at any output resolution.
arXiv Detail & Related papers (2021-10-28T18:00:00Z)
- CodedStereo: Learned Phase Masks for Large Depth-of-field Stereo [24.193656749401075]
Conventional stereo suffers from a fundamental trade-off between imaging volume and signal-to-noise ratio.
We propose a novel end-to-end learning-based technique to overcome this limitation.
We show a 6x increase in volume that can be imaged in simulation.
arXiv Detail & Related papers (2021-04-09T23:44:52Z)
- Polka Lines: Learning Structured Illumination and Reconstruction for
Active Stereo [52.68109922159688]
We introduce a novel differentiable image formation model for active stereo, relying on both wave and geometric optics, and a novel trinocular reconstruction network.
The jointly optimized pattern, which we dub "Polka Lines," together with the reconstruction network, achieves state-of-the-art active-stereo depth estimates across imaging conditions.
arXiv Detail & Related papers (2020-11-26T04:02:43Z)
- Du$^2$Net: Learning Depth Estimation from Dual-Cameras and Dual-Pixels [16.797169907541164]
We present a novel approach based on neural networks for depth estimation that combines stereo from dual cameras with stereo from a dual-pixel sensor.
Our network uses a novel architecture to fuse these two sources of information and can overcome the limitations of pure binocular stereo matching.
arXiv Detail & Related papers (2020-03-31T15:39:43Z)