Equivariant Light Field Convolution and Transformer
- URL: http://arxiv.org/abs/2212.14871v2
- Date: Wed, 7 Jun 2023 18:00:48 GMT
- Title: Equivariant Light Field Convolution and Transformer
- Authors: Yinshuang Xu, Jiahui Lei, Kostas Daniilidis
- Abstract summary: Deep learning of geometric priors from 2D images often requires each image to be represented in a $2D$ canonical frame.
We show how to learn priors from multiple views equivariant to coordinate frame transformations by proposing an $SE(3)$-equivariant convolution and transformer in the space of rays in 3D.
- Score: 40.840098156362316
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D reconstruction and novel view rendering can greatly benefit from geometric
priors when the input views are not sufficient in terms of coverage and
inter-view baselines. Deep learning of geometric priors from 2D images often
requires each image to be represented in a $2D$ canonical frame and the prior
to be learned in a given or learned $3D$ canonical frame. In this paper, given
only the relative poses of the cameras, we show how to learn priors from
multiple views equivariant to coordinate frame transformations by proposing an
$SE(3)$-equivariant convolution and transformer in the space of rays in 3D.
This enables the creation of a light field that remains equivariant to the
choice of coordinate frame. The light field, as defined in our work, refers to
both the radiance field and the feature field defined on the ray space. We model
the ray space, the domain of the light field, as a homogeneous space of $SE(3)$
and introduce the $SE(3)$-equivariant convolution in ray space. Depending on
the output domain of the convolution, we present convolution-based
$SE(3)$-equivariant maps from ray space to ray space and to $\mathbb{R}^3$. Our
mathematical framework allows us to go beyond convolution to
$SE(3)$-equivariant attention in the ray space. We demonstrate how to tailor
and adapt the equivariant convolution and transformer in the tasks of
equivariant neural rendering and $3D$ reconstruction from multiple views. We
demonstrate $SE(3)$-equivariance by obtaining robust results in roto-translated
datasets without performing transformation augmentation.
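To make the equivariance requirement concrete, the minimal sketch below uses the standard Plücker parameterization of rays and checks numerically that a pairwise quantity on rays (the reciprocal product) is unchanged when both rays are moved by the same rigid motion. This is only an illustration of the kind of frame-independent quantities an $SE(3)$-equivariant kernel on ray space can be built from; it is not the paper's implementation, and the function names and the choice of invariant are ours.
```python
import numpy as np

def random_rotation():
    """Sample a random 3x3 rotation matrix (det = +1) via QR decomposition."""
    q, _ = np.linalg.qr(np.random.randn(3, 3))
    return q * np.sign(np.linalg.det(q))

def ray_plucker(point, direction):
    """Plücker coordinates (d, m) of the ray through `point` with direction `direction`."""
    d = direction / np.linalg.norm(direction)
    return d, np.cross(point, d)

def transform_ray(R, t, d, m):
    """Action of the rigid motion (R, t) on a ray given in Plücker coordinates."""
    return R @ d, R @ m + np.cross(t, R @ d)

def reciprocal_product(d1, m1, d2, m2):
    """Pairwise quantity d1·m2 + d2·m1; unchanged when both rays undergo the same rigid motion."""
    return d1 @ m2 + d2 @ m1

# Two arbitrary rays and a random change of world frame g = (R, t).
d1, m1 = ray_plucker(np.array([0.3, -1.0, 2.0]), np.array([0.0, 0.5, 1.0]))
d2, m2 = ray_plucker(np.array([1.0, 0.2, -0.5]), np.array([1.0, -1.0, 0.3]))
R, t = random_rotation(), np.random.randn(3)

before = reciprocal_product(d1, m1, d2, m2)
after = reciprocal_product(*transform_ray(R, t, d1, m1), *transform_ray(R, t, d2, m2))
assert np.allclose(before, after)  # the quantity does not depend on the coordinate frame
```
A kernel or attention weight built from such frame-independent quantities yields features that transform consistently with the chosen world frame, which is the property the abstract describes for roto-translated inputs.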
Related papers
- $π^3$: Scalable Permutation-Equivariant Visual Geometry Learning [50.80418813055225]
$π^3$ is a feed-forward neural network that offers a novel approach to visual geometry reconstruction. $π^3$ employs a fully permutation-equivariant architecture to predict affine-invariant camera poses and scale-invariant local point maps.
arXiv Detail & Related papers (2025-07-17T17:59:53Z) - You Need a Transition Plane: Bridging Continuous Panoramic 3D Reconstruction with Perspective Gaussian Splatting [57.44295803750027]
We present a novel framework, named TPGS, to bridge continuous panoramic 3D scene reconstruction with perspective Gaussian splatting.
Specifically, we optimize 3D Gaussians within individual cube faces and then fine-tune them in the stitched panoramic space.
Experiments on indoor and outdoor, egocentric, and roaming benchmark datasets demonstrate that our approach outperforms existing state-of-the-art methods.
arXiv Detail & Related papers (2025-04-12T03:42:50Z) - ODGS: 3D Scene Reconstruction from Omnidirectional Images with 3D Gaussian Splattings [48.72040500647568]
We present ODGS, a novel rasterization pipeline for omnidirectional images with a geometric interpretation.
The entire pipeline is parallelized, achieving optimization and rendering speeds 100 times faster than NeRF-based methods.
Results show ODGS restores fine details effectively, even when reconstructing large 3D scenes.
arXiv Detail & Related papers (2024-10-28T02:45:13Z) - Learning Naturally Aggregated Appearance for Efficient 3D Editing [94.47518916521065]
We propose to replace the color field with an explicit 2D appearance aggregation, also called canonical image.
To avoid the distortion effect and facilitate convenient editing, we complement the canonical image with a projection field that maps 3D points onto 2D pixels for texture lookup.
Our representation, dubbed AGAP, well supports various ways of 3D editing (e.g., stylization, interactive drawing, and content extraction) with no need of re-optimization.
arXiv Detail & Related papers (2023-12-11T18:59:31Z) - MoDA: Modeling Deformable 3D Objects from Casual Videos [84.29654142118018]
We propose neural dual quaternion blend skinning (NeuDBS) to achieve 3D point deformation without skin-collapsing artifacts.
In the endeavor to register 2D pixels across different frames, we establish a correspondence between canonical feature embeddings that encodes 3D points within the canonical space.
Our approach can reconstruct 3D models for humans and animals with better qualitative and quantitative performance than state-of-the-art methods.
arXiv Detail & Related papers (2023-04-17T13:49:04Z) - Equivalence Between SE(3) Equivariant Networks via Steerable Kernels and
Group Convolution [90.67482899242093]
A wide range of techniques have been proposed in recent years for designing neural networks for 3D data that are equivariant under rotation and translation of the input.
We provide an in-depth analysis of both methods and their equivalence and relate the two constructions to multiview convolutional networks.
We also derive new TFN non-linearities from our equivalence principle and test them on practical benchmark datasets.
arXiv Detail & Related papers (2022-11-29T03:42:11Z) - EpiGRAF: Rethinking training of 3D GANs [60.38818140637367]
We show that it is possible to obtain a high-resolution 3D generator with SotA image quality by following a completely different route of simply training the model patch-wise.
The resulting model, named EpiGRAF, is an efficient, high-resolution, pure 3D generator.
arXiv Detail & Related papers (2022-06-21T17:08:23Z) - Rotation Equivariant 3D Hand Mesh Generation from a Single RGB Image [1.8692254863855962]
We develop a rotation equivariant model for generating 3D hand meshes from 2D RGB images.
This guarantees that as the input image of a hand is rotated the generated mesh undergoes a corresponding rotation.
arXiv Detail & Related papers (2021-11-25T11:07:27Z) - i3dLoc: Image-to-range Cross-domain Localization Robust to Inconsistent
Environmental Conditions [9.982307144353713]
We present a method for localizing a single camera with respect to a point cloud map in indoor and outdoor scenes.
Our method can match equirectangular images to the 3D range projections by extracting cross-domain symmetric place descriptors.
With a single trained model, i3dLoc can demonstrate reliable visual localization in random conditions.
arXiv Detail & Related papers (2021-05-27T00:13:11Z) - Equivariant Point Network for 3D Point Cloud Analysis [17.689949017410836]
We propose an effective and practical SE(3) (3D translation and rotation) equivariant network for point cloud analysis.
First, we present SE(3) separable point convolution, a novel framework that breaks down the 6D convolution into two separable convolutional operators.
Second, we introduce an attention layer to effectively harness the expressiveness of the equivariant features.
arXiv Detail & Related papers (2021-03-25T21:57:10Z) - Rotation-Invariant Autoencoders for Signals on Spheres [10.406659081400354]
We study the problem of unsupervised learning of rotation-invariant representations for spherical images.
In particular, we design an autoencoder architecture consisting of $S^2$ and $SO(3)$ convolutional layers.
Experiments on multiple datasets demonstrate the usefulness of the learned representations on clustering, retrieval and classification applications.
arXiv Detail & Related papers (2020-12-08T15:15:03Z) - Generalizing Spatial Transformers to Projective Geometry with
Applications to 2D/3D Registration [11.219924013808852]
Differentiable rendering is a technique to connect 3D scenes with corresponding 2D images.
We propose a novel Projective Spatial Transformer module that generalizes spatial transformers to projective geometry.
arXiv Detail & Related papers (2020-03-24T17:26:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.