Learning Neural Representation of Camera Pose with Matrix Representation of Pose Shift via View Synthesis
- URL: http://arxiv.org/abs/2104.01508v1
- Date: Sun, 4 Apr 2021 00:40:53 GMT
- Title: Learning Neural Representation of Camera Pose with Matrix Representation of Pose Shift via View Synthesis
- Authors: Yaxuan Zhu, Ruiqi Gao, Siyuan Huang, Song-Chun Zhu, Ying Nian Wu
- Abstract summary: How to effectively represent camera pose is an essential problem in 3D computer vision.
We propose an approach to learn neural representations of camera poses and 3D scenes.
We conduct extensive experiments on synthetic and real datasets.
- Score: 105.37072293076767
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How to effectively represent camera pose is an essential problem in 3D
computer vision, especially in tasks such as camera pose regression and novel
view synthesis. Traditionally, the 3D position of the camera is represented by
Cartesian coordinates and the orientation by Euler angles or quaternions. These
representations are manually designed and may not be the most effective for
downstream tasks. In this work, we propose an approach to learn neural
representations of camera poses and 3D scenes, coupled with neural
representations of local camera movements. Specifically, the camera pose and
the 3D scene are represented as vectors, and a local camera movement is
represented as a matrix operating on the vector of the camera pose. We
demonstrate that the camera movement can further be parametrized by a matrix
Lie algebra that underlies a rotation system in the neural space. The vector
representations are then concatenated and decoded into the posed 2D image
through a decoder network. The model is learned from posed 2D images and the
corresponding camera poses alone, without access to depths or shapes. We
conduct extensive experiments on synthetic and real datasets. The results show
that, compared with other camera pose representations, our learned
representation is more robust to noise in novel view synthesis and more
effective in camera pose regression.
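To make the mechanics concrete, the sketch below shows the core idea in PyTorch: the camera pose is embedded as a learned vector, a local pose shift with coefficients a acts on that vector through the matrix exponential exp(sum_i a_i B_i) of learned Lie-algebra generators B_i, and the scene and pose vectors are concatenated and decoded into an image. Everything here is an illustrative assumption, not the authors' released code: the class and method names, the layer sizes, the 6-DoF input parametrization, and the 32x32 decoder output are all made up for exposition.

```python
import torch
import torch.nn as nn


class PoseRepresentation(nn.Module):
    """Hypothetical sketch of the paper's pose/scene/shift representation."""

    def __init__(self, pose_dim=128, num_generators=6, img_ch=3):
        super().__init__()
        self.img_ch = img_ch
        # Assumed embedding of an absolute 6-DoF camera pose into a vector.
        self.pose_encoder = nn.Sequential(
            nn.Linear(6, 256), nn.ReLU(), nn.Linear(256, pose_dim))
        # Learned Lie-algebra generators B_i; a local pose shift with
        # coefficients a acts on the pose vector as expm(sum_i a_i * B_i).
        self.generators = nn.Parameter(
            0.01 * torch.randn(num_generators, pose_dim, pose_dim))
        # Learned scene vector and a toy decoder from [scene; pose] to pixels.
        self.scene = nn.Parameter(torch.randn(pose_dim))
        self.decoder = nn.Sequential(
            nn.Linear(2 * pose_dim, 1024), nn.ReLU(),
            nn.Linear(1024, img_ch * 32 * 32))

    def shift(self, v, coeffs):
        # Matrix representation of the pose shift: M = expm(sum_i a_i * B_i).
        A = torch.einsum('k,kij->ij', coeffs, self.generators)
        M = torch.linalg.matrix_exp(A)
        return v @ M.T  # apply the same shift to every pose vector in the batch

    def forward(self, pose_6dof, shift_coeffs=None):
        v = self.pose_encoder(pose_6dof)            # (B, pose_dim)
        if shift_coeffs is not None:
            v = self.shift(v, shift_coeffs)
        scene = self.scene.expand(v.shape[0], -1)   # shared scene vector
        z = torch.cat([scene, v], dim=-1)
        return self.decoder(z).view(-1, self.img_ch, 32, 32)


# Usage: zero shift coefficients give expm(0) = I, i.e. an identity shift.
model = PoseRepresentation()
imgs = model(torch.randn(4, 6), shift_coeffs=torch.zeros(6))
```

Because the shift is a matrix exponential of a linear combination of generators, small shifts compose like a group action on the pose vector (and zero coefficients give the identity), which is what makes the Lie-algebra parametrization a natural fit for local camera movements.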
Related papers
- COLMAP-Free 3D Gaussian Splatting [88.420322646756]
We propose a novel method to perform novel view synthesis without any SfM preprocessing.
We process the input frames in a sequential manner and progressively grow the set of 3D Gaussians by taking one input frame at a time.
Our method significantly improves over previous approaches in view synthesis and camera pose estimation under large motion changes.
arXiv Detail & Related papers (2023-12-12T18:39:52Z)
- FlowCam: Training Generalizable 3D Radiance Fields without Camera Poses via Pixel-Aligned Scene Flow [26.528667940013598]
Reconstruction of 3D neural fields from posed images has emerged as a promising method for self-supervised representation learning.
A key challenge preventing the deployment of these 3D scene learners on large-scale video data is their dependence on precise camera poses from structure-from-motion.
We propose a method that jointly reconstructs camera poses and 3D neural scene representations online and in a single forward pass.
arXiv Detail & Related papers (2023-05-31T20:58:46Z)
- Inverting the Imaging Process by Learning an Implicit Camera Model [73.81635386829846]
This paper proposes a novel implicit camera model which represents the physical imaging process of a camera as a deep neural network.
We demonstrate the power of this new implicit camera model on two inverse imaging tasks.
arXiv Detail & Related papers (2023-04-25T11:55:03Z)
- RUST: Latent Neural Scene Representations from Unposed Imagery [21.433079925439234]
Inferring the structure of 3D scenes from 2D observations is a fundamental challenge in computer vision.
Recently popularized approaches based on neural scene representations have had a tremendous impact.
RUST (Really Unposed Scene representation Transformer) is a pose-free approach to novel view synthesis, trained on RGB images alone.
arXiv Detail & Related papers (2022-11-25T18:59:10Z)
- EpipolarNVS: leveraging on Epipolar geometry for single-image Novel View Synthesis [6.103988053817792]
Novel-view synthesis (NVS) can be tackled through different approaches, depending on the general setting.
The most challenging scenario, which we address in this work, considers only a single source image from which to generate a novel one from another viewpoint.
We introduce an innovative method that encodes the viewpoint transformation as a 2D feature image.
arXiv Detail & Related papers (2022-10-24T09:54:20Z)
- GaussiGAN: Controllable Image Synthesis with 3D Gaussians from Unposed Silhouettes [48.642181362172906]
We present an algorithm that learns a coarse 3D representation of objects from unposed multi-view 2D mask supervision.
In contrast to existing voxel-based methods for unposed object reconstruction, our approach learns to represent both the generated shape and its pose.
We show results on synthetic datasets with realistic lighting, and demonstrate object insertion with interactive posing.
arXiv Detail & Related papers (2021-06-24T17:47:58Z)
- CAMPARI: Camera-Aware Decomposed Generative Neural Radiance Fields [67.76151996543588]
We learn a 3D- and camera-aware generative model which faithfully recovers not only the image but also the camera data distribution.
At test time, our model generates images with explicit control over the camera as well as the shape and appearance of the scene.
arXiv Detail & Related papers (2021-03-31T17:59:24Z)
- Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation [57.11299763566534]
We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera view-points.
Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2D detections.
arXiv Detail & Related papers (2020-04-05T12:52:29Z)