Deep Permutation Equivariant Structure from Motion
- URL: http://arxiv.org/abs/2104.06703v1
- Date: Wed, 14 Apr 2021 08:50:06 GMT
- Title: Deep Permutation Equivariant Structure from Motion
- Authors: Dror Moran, Hodaya Koslowsky, Yoni Kasten, Haggai Maron, Meirav Galun,
Ronen Basri
- Abstract summary: Existing deep methods produce highly accurate 3D reconstructions in stereo and multiview stereo settings.
We propose a neural network architecture that recovers both the camera parameters and a scene structure by minimizing an unsupervised reprojection loss.
Our experiments, conducted on a variety of datasets in both internally calibrated and uncalibrated settings, indicate that our method accurately recovers pose and structure, on par with classical state of the art methods.
- Score: 38.68492294795315
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing deep methods produce highly accurate 3D reconstructions in stereo
and multiview stereo settings, i.e., when cameras are both internally and
externally calibrated. Nevertheless, the challenge of simultaneous recovery of
camera poses and 3D scene structure in multiview settings with deep networks is
still outstanding. Inspired by projective factorization for Structure from
Motion (SFM) and by deep matrix completion techniques, we propose a neural
network architecture that, given a set of point tracks in multiple images of a
static scene, recovers both the camera parameters and a (sparse) scene
structure by minimizing an unsupervised reprojection loss. Our network
architecture is designed to respect the structure of the problem: the sought
output is equivariant to permutations of both cameras and scene points.
Notably, our method does not require initialization of camera parameters or 3D
point locations. We test our architecture in two setups: (1) single scene
reconstruction and (2) learning from multiple scenes. Our experiments,
conducted on a variety of datasets in both internally calibrated and
uncalibrated settings, indicate that our method accurately recovers pose and
structure, on par with classical state of the art methods. Additionally, we
show that a pre-trained network can be used to reconstruct novel scenes using
inexpensive fine-tuning with no loss of accuracy.
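The abstract names two concrete mechanisms: a network whose output is equivariant to permutations of both cameras and scene points, and an unsupervised reprojection loss. A minimal NumPy sketch of both ideas follows; the function names, weight layout, and shapes are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def equivariant_layer(T, w):
    """One permutation-equivariant layer over an (m, n, d) feature tensor
    for m cameras and n point tracks. Mixing each entry with its per-camera
    mean, per-point mean, and global mean guarantees that permuting cameras
    or points permutes the output the same way."""
    row = T.mean(axis=1, keepdims=True)        # (m, 1, d) per-camera summary
    col = T.mean(axis=0, keepdims=True)        # (1, n, d) per-point summary
    glob = T.mean(axis=(0, 1), keepdims=True)  # (1, 1, d) global summary
    out = T @ w["self"] + row @ w["row"] + col @ w["col"] + glob @ w["glob"]
    return np.maximum(out, 0.0)  # ReLU

def reprojection_loss(P, X, tracks, mask):
    """Unsupervised reprojection loss: mean pixel distance between observed
    tracks and the projections of the predicted 3D points.

    P : (m, 3, 4) predicted camera matrices;  X : (n, 3) predicted points;
    tracks : (m, n, 2) observed 2D coordinates;  mask : (m, n) visibility."""
    Xh = np.concatenate([X, np.ones((len(X), 1))], axis=1)  # homogeneous (n, 4)
    proj = np.einsum("mij,nj->mni", P, Xh)                  # (m, n, 3)
    uv = proj[..., :2] / proj[..., 2:3]                     # perspective divide
    err = np.linalg.norm(uv - tracks, axis=-1)              # (m, n) pixel errors
    return (err * mask).sum() / mask.sum()
```

Because the loss compares projections directly against the observed point tracks, minimizing it requires no ground-truth supervision and no initialization of camera parameters or 3D points, consistent with the abstract's claim; a real implementation would stack many such layers with learned weights.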
Related papers
- DUSt3R: Geometric 3D Vision Made Easy [8.471330244002564]
We introduce DUSt3R, a novel paradigm for Dense and Unconstrained Stereo 3D Reconstruction of arbitrary image collections.
We show that this formulation smoothly unifies the monocular and binocular reconstruction cases.
Our formulation directly provides a 3D model of the scene as well as depth information; interestingly, we can seamlessly recover from it pixel matches and relative and absolute camera poses.
arXiv Detail & Related papers (2023-12-21T18:52:14Z) - Visual Geometry Grounded Deep Structure From Motion [20.203320509695306]
We propose a new deep pipeline VGGSfM, where each component is fully differentiable and can be trained in an end-to-end manner.
First, we build on recent advances in deep 2D point tracking to extract reliable pixel-accurate tracks, which eliminates the need for chaining pairwise matches.
We attain state-of-the-art performance on three popular datasets, CO3D, IMC Phototourism, and ETH3D.
arXiv Detail & Related papers (2023-12-07T18:59:52Z) - R3D3: Dense 3D Reconstruction of Dynamic Scenes from Multiple Cameras [106.52409577316389]
R3D3 is a multi-camera system for dense 3D reconstruction and ego-motion estimation.
Our approach exploits spatio-temporal information from multiple cameras together with monocular depth refinement.
We show that this design enables a dense, consistent 3D reconstruction of challenging, dynamic outdoor environments.
arXiv Detail & Related papers (2023-08-28T17:13:49Z) - FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z) - Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized
Photography [54.36608424943729]
We show that in a "long-burst" — forty-two 12-megapixel RAW frames captured in a two-second sequence — there is enough parallax information from natural hand tremor alone to recover high-quality scene depth.
We devise a test-time optimization approach that fits a neural RGB-D representation to long-burst data and simultaneously estimates scene depth and camera motion.
arXiv Detail & Related papers (2022-12-22T18:54:34Z) - Towards Non-Line-of-Sight Photography [48.491977359971855]
Non-line-of-sight (NLOS) imaging is based on capturing the multi-bounce indirect reflections from the hidden objects.
Active NLOS imaging systems rely on the capture of the time of flight of light through the scene.
We propose a new problem formulation, called NLOS photography, to specifically address this deficiency.
arXiv Detail & Related papers (2021-09-16T08:07:13Z) - Reconstructing Small 3D Objects in front of a Textured Background [0.0]
We present a technique for a complete 3D reconstruction of small objects moving in front of a textured background.
It is a variation of multibody structure from motion that specializes to exactly two objects.
In experiments with real artifacts, we show that our approach has practical advantages when reconstructing 3D objects from all sides.
arXiv Detail & Related papers (2021-05-24T15:36:33Z) - Vid2Curve: Simultaneous Camera Motion Estimation and Thin Structure
Reconstruction from an RGB Video [90.93141123721713]
Thin structures, such as wire-frame sculptures, fences, cables, power lines, and tree branches, are common in the real world.
It is extremely challenging to acquire their 3D digital models using traditional image-based or depth-based reconstruction methods because thin structures often lack distinct point features and have severe self-occlusion.
We propose the first approach that simultaneously estimates camera motion and reconstructs the geometry of complex 3D thin structures in high quality from a color video captured by a handheld camera.
arXiv Detail & Related papers (2020-05-07T10:39:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.