Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized Photography
- URL: http://arxiv.org/abs/2212.12324v2
- Date: Mon, 27 Mar 2023 18:54:46 GMT
- Title: Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized Photography
- Authors: Ilya Chugunov, Yuxuan Zhang, Felix Heide
- Abstract summary: We show that in a "long-burst", forty-two 12-megapixel RAW frames captured in a two-second sequence, there is enough parallax information from natural hand tremor alone to recover high-quality scene depth.
We devise a test-time optimization approach that fits a neural RGB-D representation to long-burst data and simultaneously estimates scene depth and camera motion.
- Score: 54.36608424943729
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern mobile burst photography pipelines capture and merge a short sequence
of frames to recover an enhanced image, but often disregard the 3D nature of
the scene they capture, treating pixel motion between images as a 2D
aggregation problem. We show that in a "long-burst", forty-two 12-megapixel
RAW frames captured in a two-second sequence, there is enough parallax
information from natural hand tremor alone to recover high-quality scene depth.
To this end, we devise a test-time optimization approach that fits a neural
RGB-D representation to long-burst data and simultaneously estimates scene
depth and camera motion. Our plane plus depth model is trained end-to-end, and
performs coarse-to-fine refinement by controlling which multi-resolution volume
features the network has access to at what time during training. We validate
the method experimentally, and demonstrate geometrically accurate depth
reconstructions with no additional hardware or separate data pre-processing and
pose-estimation steps.
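To make the test-time optimization idea concrete, the following is a minimal sketch, not the authors' implementation: it stands in for the paper's multi-resolution plane-plus-depth volume with a small coordinate MLP, models hand tremor as translation-only camera motion under a first-order parallax approximation, and uses toy random frames with an assumed focal length. All names here (`depth_mlp`, `translations`, `f`) are illustrative, not taken from the paper's code.

```python
# Hedged sketch: jointly fit a depth field and per-frame camera translations
# to a long-burst by minimizing photometric reprojection error (toy data).
import torch
import torch.nn.functional as F

N, H, W = 42, 64, 64                  # frames, height, width (toy resolution)
frames = torch.rand(N, 1, H, W)       # stand-in for the long-burst RAW frames
f = 500.0                             # assumed focal length in pixels

# Implicit depth model: a small MLP over normalized pixel coordinates
# (the paper instead uses a multi-resolution plane-plus-depth volume).
depth_mlp = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 1), torch.nn.Softplus(),   # keep depth positive
)
# Per-frame camera translations from hand tremor; frame 0 is the reference.
translations = torch.zeros(N, 3, requires_grad=True)

ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
coords = torch.stack([xs, ys], dim=-1).float()                  # (H, W, 2) pixels
coords_norm = coords / torch.tensor([W - 1.0, H - 1.0]) * 2 - 1

opt = torch.optim.Adam(list(depth_mlp.parameters()) + [translations], lr=1e-3)
for step in range(200):
    depth = depth_mlp(coords_norm.reshape(-1, 2)).reshape(H, W) + 1e-3
    loss = 0.0
    for i in range(1, N):
        tx, ty, tz = translations[i]
        # First-order parallax of a translating camera: flow scales with 1/depth.
        flow_x = (-f * tx + (coords[..., 0] - W / 2) * tz) / depth
        flow_y = (-f * ty + (coords[..., 1] - H / 2) * tz) / depth
        grid = torch.stack(
            [(coords[..., 0] + flow_x) / (W - 1) * 2 - 1,
             (coords[..., 1] + flow_y) / (H - 1) * 2 - 1], dim=-1)
        warped = F.grid_sample(frames[i:i + 1], grid[None], align_corners=True)
        # Photometric loss against the reference frame drives depth and motion.
        loss = loss + (warped[0, 0] - frames[0, 0]).abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The paper's coarse-to-fine schedule, which gates the network's access to finer volume resolutions over the course of training, is omitted here for brevity.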
Related papers
- FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z)
- Towards Accurate Reconstruction of 3D Scene Shape from A Single Monocular Image [91.71077190961688]
We propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image.
We then exploit 3D point cloud data to predict the depth shift and the camera's focal length, which allow us to recover 3D scene shapes (see the unprojection sketch after this list).
We test our depth model on nine unseen datasets and achieve state-of-the-art performance on zero-shot evaluation.
arXiv Detail & Related papers (2022-08-28T16:20:14Z)
- DEVO: Depth-Event Camera Visual Odometry in Challenging Conditions [30.892930944644853]
We present a novel real-time visual odometry framework for a stereo setup of a depth and high-resolution event camera.
Our framework balances accuracy and robustness against computational efficiency, achieving strong performance in challenging scenarios.
arXiv Detail & Related papers (2022-02-05T13:46:47Z)
- Towards Non-Line-of-Sight Photography [48.491977359971855]
Non-line-of-sight (NLOS) imaging is based on capturing multi-bounce indirect reflections from hidden objects.
Active NLOS imaging systems rely on capturing the time of flight of light through the scene.
We propose a new problem formulation, called NLOS photography, to address the limitations of such systems.
arXiv Detail & Related papers (2021-09-16T08:07:13Z)
- Learning to Recover 3D Scene Shape from a Single Image [98.20106822614392]
We propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image.
We then use 3D point cloud encoders to predict the missing depth shift and focal length that allow us to recover a realistic 3D scene shape.
arXiv Detail & Related papers (2020-12-17T02:35:13Z)
- Robust Consistent Video Depth Estimation [65.53308117778361]
We present an algorithm for estimating consistent dense depth maps and camera poses from a monocular video.
Our algorithm combines two complementary techniques: (1) flexible deformation-splines for low-frequency large-scale alignment and (2) geometry-aware depth filtering for high-frequency alignment of fine depth details.
In contrast to prior approaches, our method does not require camera poses as input and achieves robust reconstruction for challenging hand-held cell phone captures containing a significant amount of noise, shake, motion blur, and rolling shutter deformations.
arXiv Detail & Related papers (2020-12-10T18:59:48Z)
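As referenced in the scene-shape entries above, the following is a minimal sketch, with toy values, of why the depth shift and focal length matter for recovering scene shape: unprojecting a relative depth map through a pinhole model. The scale `s`, shift `b`, and focal length `f` below are hypothetical placeholders, not values or code from the papers.

```python
# Hedged sketch: back-project a scale/shift-corrected depth map to a point cloud.
import numpy as np

H, W = 48, 64
d_rel = np.random.rand(H, W)        # relative depth, known only up to scale and shift
s, b = 2.0, 0.5                     # hypothetical scale and shift the second stage predicts
f = 55.0                            # hypothetical focal length in pixels
cx, cy = (W - 1) / 2, (H - 1) / 2   # principal point at the image center

d = s * d_rel + b                   # depth after undoing the scale/shift ambiguity
v, u = np.mgrid[0:H, 0:W]           # pixel row (v) and column (u) indices
X = (u - cx) * d / f                # pinhole back-projection to camera coordinates
Y = (v - cy) * d / f
Z = d
points = np.stack([X, Y, Z], axis=-1).reshape(-1, 3)   # (H*W, 3) point cloud

# A wrong shift b or focal length f distorts this cloud (planar surfaces bend),
# which is the geometric cue the point cloud networks in these papers exploit.
```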