Holographic-(V)AE: an end-to-end SO(3)-Equivariant (Variational) Autoencoder in Fourier Space
- URL: http://arxiv.org/abs/2209.15567v2
- Date: Mon, 12 Jun 2023 03:54:09 GMT
- Authors: Gian Marco Visani, Michael N. Pun, Arman Angaji, Armita Nourmohammad
- Abstract summary: Group-equivariant neural networks have emerged as a data-efficient approach to solve classification and regression tasks.
Here, we present Holographic-(Variational) Autoencoder in Fourier space, suitable for unsupervised learning and generation of data distributed around a specified origin in 3D.
We show that the learned latent space efficiently encodes the categorical features of spherical images.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Group-equivariant neural networks have emerged as a data-efficient approach
to solve classification and regression tasks, while respecting the relevant
symmetries of the data. However, little work has been done to extend this
paradigm to the unsupervised and generative domains. Here, we present
Holographic-(Variational) Auto Encoder (H-(V)AE), a fully end-to-end
SO(3)-equivariant (variational) autoencoder in Fourier space, suitable for
unsupervised learning and generation of data distributed around a specified
origin in 3D. H-(V)AE is trained to reconstruct the spherical Fourier encoding
of data, learning in the process a low-dimensional representation of the data
(i.e., a latent space) with a maximally informative rotationally invariant
embedding alongside an equivariant frame describing the orientation of the
data. We extensively test the performance of H-(V)AE on diverse datasets. We
show that the learned latent space efficiently encodes the categorical features
of spherical images. Moreover, H-(V)AE's latent space can be used to extract
compact embeddings for protein structure microenvironments, and when paired
with a Random Forest Regressor, it enables state-of-the-art predictions of
protein-ligand binding affinity.
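The split between a rotationally invariant embedding and an equivariant frame can be illustrated with a small numerical check. The following is a minimal sketch, not the authors' implementation: it assumes, purely for illustration, that the frame is obtained by Gram-Schmidt orthonormalization of two 3D vectors, which the abstract does not specify. The helper names `frame_from_vectors` and `random_rotation` are hypothetical.

```python
import numpy as np

def frame_from_vectors(v1, v2):
    """Orthonormalize two 3D vectors into a right-handed frame (rows are axes).

    Assumption for illustration only: the abstract does not say how
    H-(V)AE builds its frame.
    """
    e1 = v1 / np.linalg.norm(v1)
    u2 = v2 - np.dot(v2, e1) * e1      # remove the component along e1
    e2 = u2 / np.linalg.norm(u2)
    e3 = np.cross(e1, e2)              # completes a right-handed basis
    return np.stack([e1, e2, e3])

def random_rotation(rng):
    """Draw a proper rotation matrix (det = +1) from a QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))        # fix the sign ambiguity of QR
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]             # flip one axis to get det = +1
    return q

rng = np.random.default_rng(0)
v1, v2 = rng.normal(size=3), rng.normal(size=3)
R = random_rotation(rng)

F = frame_from_vectors(v1, v2)
F_rot = frame_from_vectors(R @ v1, R @ v2)

# Equivariance: rotating the inputs rotates every frame axis by R.
frame_is_equivariant = np.allclose(F_rot, F @ R.T)

# Invariance: coordinates of a point expressed in the frame do not
# change when the point and the frame inputs are rotated together.
x = rng.normal(size=3)
coords_are_invariant = np.allclose(F @ x, F_rot @ (R @ x))

print(frame_is_equivariant, coords_are_invariant)
```

In this toy setting the frame carries all orientation information, so any quantity expressed in frame coordinates is unchanged by a global rotation. This mirrors, in a much simpler form, the invariant-plus-frame decomposition of the latent space described in the abstract.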
Related papers
- 3D Equivariant Pose Regression via Direct Wigner-D Harmonics Prediction [50.07071392673984]
Existing methods learn 3D rotations parametrized in the spatial domain using angles or quaternions.
We propose a frequency-domain approach that directly predicts Wigner-D coefficients for 3D rotation regression.
Our method achieves state-of-the-art results on benchmarks such as ModelNet10-SO(3) and PASCAL3D+.
arXiv Detail & Related papers (2024-11-01T12:50:38Z)
- Uniform Transformation: Refining Latent Representation in Variational Autoencoders [7.4316292428754105]
We introduce a novel adaptable three-stage Uniform Transformation (UT) module to address irregular latent distributions.
By reconfiguring irregular distributions into a uniform distribution in the latent space, our approach significantly enhances the disentanglement and interpretability of latent representations.
Empirical evaluations demonstrated the efficacy of our proposed UT module in improving disentanglement metrics across benchmark datasets.
arXiv Detail & Related papers (2024-07-02T21:46:23Z)
- IPoD: Implicit Field Learning with Point Diffusion for Generalizable 3D Object Reconstruction from Single RGB-D Images [50.4538089115248]
Generalizable 3D object reconstruction from single-view RGB-D images remains a challenging task.
We propose a novel approach, IPoD, which harmonizes implicit field learning with point diffusion.
Experiments conducted on the CO3D-v2 dataset affirm the superiority of IPoD, achieving 7.8% improvement in F-score and 28.6% in Chamfer distance over existing methods.
arXiv Detail & Related papers (2024-03-30T07:17:37Z)
- Improved Cryo-EM Pose Estimation and 3D Classification through Latent-Space Disentanglement [14.973360669658561]
We propose a self-supervised variational autoencoder architecture called "HetACUMN" based on amortized inference.
Results on simulated datasets show that HetACUMN generated more accurate conformational classifications than other amortized or non-amortized methods.
arXiv Detail & Related papers (2023-08-09T13:41:30Z)
- VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
The nonlinearity of the generator implies that the latent space shows an unsatisfactory projection of the data space, which results in poor representation learning.
We show that geodesics and accurate computation can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z)
- Score-based Diffusion Models in Function Space [140.792362459734]
Diffusion models have recently emerged as a powerful framework for generative modeling.
We introduce a mathematically rigorous framework called Denoising Diffusion Operators (DDOs) for training diffusion models in function space.
We show that the corresponding discretized algorithm generates accurate samples at a fixed cost independent of the data resolution.
arXiv Detail & Related papers (2023-02-14T23:50:53Z)
- RENs: Relevance Encoding Networks [0.0]
This paper proposes relevance encoding networks (RENs): a novel probabilistic VAE-based framework that uses the automatic relevance determination (ARD) prior in the latent space to learn the data-specific bottleneck dimensionality.
We show that the proposed model learns the relevant latent bottleneck dimensionality without compromising the representation and generation quality of the samples.
arXiv Detail & Related papers (2022-05-25T21:53:48Z)
- Geometry-Contrastive Transformer for Generalized 3D Pose Transfer [95.56457218144983]
The intuition of this work is to perceive the geometric inconsistency between the given meshes with the powerful self-attention mechanism.
We propose a novel geometry-contrastive Transformer that efficiently perceives global geometric inconsistencies in 3D structures.
We present a latent isometric regularization module together with a novel semi-synthesized dataset for the cross-dataset 3D pose transfer task.
arXiv Detail & Related papers (2021-12-14T13:14:24Z)
- Rotation-Invariant Local-to-Global Representation Learning for 3D Point Cloud [42.86112554931754]
We propose a local-to-global representation learning algorithm for 3D point cloud data.
Our model takes advantage of multi-level abstraction based on graph convolutional neural networks.
The proposed algorithm achieves state-of-the-art performance on the rotation-augmented 3D object recognition and segmentation benchmarks.
arXiv Detail & Related papers (2020-10-07T10:30:20Z)
- Spatial Information Guided Convolution for Real-Time RGBD Semantic Segmentation [79.78416804260668]
We propose Spatial information guided Convolution (S-Conv), which allows efficient RGB feature and 3D spatial information integration.
S-Conv can infer the sampling offset of the convolution kernel, guided by the 3D spatial information.
We further embed S-Conv into a semantic segmentation network, called Spatial information Guided convolutional Network (SGNet).
arXiv Detail & Related papers (2020-04-09T13:38:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.