A Rotated Hyperbolic Wrapped Normal Distribution for Hierarchical
Representation Learning
- URL: http://arxiv.org/abs/2205.13371v1
- Date: Wed, 25 May 2022 07:21:45 GMT
- Title: A Rotated Hyperbolic Wrapped Normal Distribution for Hierarchical
Representation Learning
- Authors: Seunghyuk Cho, Juyong Lee, Jaesik Park, Dongwoo Kim
- Abstract summary: We present a rotated hyperbolic wrapped normal distribution (RoWN), a simple yet effective alteration of the hyperbolic wrapped normal distribution (HWN).
In this work, we analyze the geometric properties of the diagonal HWN, a standard choice of distribution in probabilistic modeling.
We show how RoWN, the newly proposed distribution, alleviates these limitations on various hierarchical datasets, including a noisy synthetic binary tree, WordNet, and Atari 2600 Breakout.
- Score: 9.980145127016172
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a rotated hyperbolic wrapped normal distribution (RoWN), a simple
yet effective alteration of a hyperbolic wrapped normal distribution (HWN). The
HWN expands the domain of probabilistic modeling from Euclidean to hyperbolic
space, where a tree can be embedded with arbitrarily low distortion in theory. In
this work, we analyze the geometric properties of the diagonal HWN, a standard
choice of distribution in probabilistic modeling. The analysis shows that the diagonal HWN is ill-suited to represent data points at the same hierarchy level, which share the same norm in the Poincaré disk model and differ only in their angular distance. We then empirically verify the presence of these limitations
of HWN, and show how RoWN, the newly proposed distribution, can alleviate the
limitations on various hierarchical datasets, including a noisy synthetic binary tree, WordNet, and Atari 2600 Breakout.
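For context, the HWN that RoWN modifies (the wrapped normal of Nagano et al., 2019) is sampled by drawing a Gaussian vector in the tangent space at the hyperboloid's origin, parallel-transporting it to the mean, and applying the exponential map. Below is a minimal numpy sketch of that sampling procedure with a diagonal covariance; per the abstract, RoWN would replace the diagonal covariance with a suitably rotated one, which is not shown here. Function names are illustrative, not taken from the paper's code.

```python
import numpy as np

def lorentz_inner(x, y):
    """Lorentz (Minkowski) inner product <x, y>_L = -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + np.dot(x[1:], y[1:])

def exp_map(mu, u):
    """Exponential map on the hyperboloid: tangent vector u at mu -> point."""
    norm_u = np.sqrt(max(lorentz_inner(u, u), 1e-12))
    return np.cosh(norm_u) * mu + np.sinh(norm_u) * (u / norm_u)

def parallel_transport(v, mu0, mu):
    """Transport tangent vector v from mu0 to mu along the geodesic."""
    alpha = -lorentz_inner(mu0, mu)
    return v + lorentz_inner(mu - alpha * mu0, v) / (alpha + 1.0) * (mu0 + mu)

def sample_hwn(mu, diag_cov, rng):
    """Draw one sample from a diagonal HWN centered at mu on the hyperboloid."""
    n = mu.shape[0] - 1                      # manifold dimension
    mu0 = np.zeros(n + 1); mu0[0] = 1.0      # origin of the hyperboloid
    v = rng.normal(0.0, np.sqrt(diag_cov))   # N(0, diag(cov)) in R^n
    v = np.concatenate([[0.0], v])           # embed in tangent space at mu0
    u = parallel_transport(v, mu0, mu)       # move to tangent space at mu
    return exp_map(mu, u)

rng = np.random.default_rng(0)
mu = np.array([np.cosh(1.0), np.sinh(1.0), 0.0])  # a point on H^2
x = sample_hwn(mu, diag_cov=np.array([0.1, 0.1]), rng=rng)
print(x, lorentz_inner(x, x))  # lies on the hyperboloid: <x, x>_L = -1
```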
Related papers
- Categorical SDEs with Simplex Diffusion [25.488210663637265]
This theoretical note proposes Simplex Diffusion, a means to directly diffuse datapoints located on an n-dimensional probability simplex.
We show how this relates to the Dirichlet distribution on the simplex and how the analogous SDE is realized thanks to a multi-dimensional Cox-Ingersoll-Ross process.
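As a rough illustration of that CIR connection, one can simulate independent CIR components with Euler-Maruyama steps and normalize them onto the simplex. This is a generic sketch under our reading of the summary, not the paper's actual diffusion scheme, and all parameter values are arbitrary.

```python
import numpy as np

def cir_step(x, a, b, sigma, dt, rng):
    """One Euler-Maruyama step of the Cox-Ingersoll-Ross SDE
    dX = a*(b - X) dt + sigma*sqrt(X) dW, applied componentwise."""
    dw = rng.normal(0.0, np.sqrt(dt), size=x.shape)
    x_new = x + a * (b - x) * dt + sigma * np.sqrt(np.maximum(x, 0.0)) * dw
    return np.maximum(x_new, 1e-12)   # truncation keeps the process positive

rng = np.random.default_rng(0)
x = rng.gamma(shape=2.0, size=4)      # positive start for 4 components
for _ in range(1000):
    x = cir_step(x, a=1.0, b=2.0, sigma=0.5, dt=1e-3, rng=rng)
p = x / x.sum()                        # project onto the probability simplex
print(p, p.sum())                      # a point on the 3-simplex
```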
arXiv Detail & Related papers (2022-10-26T15:27:43Z)
- ManiFlow: Implicitly Representing Manifolds with Normalizing Flows [145.9820993054072]
Normalizing Flows (NFs) are flexible explicit generative models that have been shown to accurately model complex real-world data distributions.
We propose an optimization objective that recovers the most likely point on the manifold given a sample from the perturbed distribution.
Finally, we focus on 3D point clouds, for which we exploit the explicit nature of NFs: surface normals extracted from the gradient of the log-likelihood, and the log-likelihood itself.
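The "normals from the gradient of the log-likelihood" idea can be shown with a toy analytic density standing in for a trained NF: when the density concentrates near a surface, the unit gradient of log p(x) is (up to sign) a normal estimate. The spherical-shell density below is a hypothetical stand-in, not the paper's model.

```python
import numpy as np

def grad_log_p(x, r=1.0, sigma=0.05):
    """Analytic gradient of log p for a toy 'spherical shell' density:
    log p(x) = -(||x|| - r)^2 / (2 sigma^2) + const."""
    norm = np.linalg.norm(x)
    return -(norm - r) / sigma**2 * (x / norm)

def surface_normal(x):
    """Normal estimate: the unit gradient of the log-likelihood."""
    g = grad_log_p(x)
    n = np.linalg.norm(g)
    # Exactly on the shell the gradient vanishes; fall back to the radial dir.
    return g / n if n > 0 else x / np.linalg.norm(x)

x = np.array([0.0, 0.0, 1.1])   # a point slightly outside the unit sphere
print(surface_normal(x))        # points back toward the shell: ~[0, 0, -1]
```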
arXiv Detail & Related papers (2022-08-18T16:07:59Z)
- A Unification Framework for Euclidean and Hyperbolic Graph Neural Networks [8.080621697426997]
Hyperbolic neural networks can effectively capture the inherent hierarchy of graph datasets.
However, they entangle multiple incongruent (gyro-)vector spaces within a layer, which limits their generalization and scalability.
We propose the Poincare disk model as our search space, and apply all approximations on the disk.
We demonstrate that our model not only leverages the power of Euclidean networks such as interpretability and efficient execution of various model components, but also outperforms both Euclidean and hyperbolic counterparts on various benchmarks.
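For reference, computations on the Poincaré disk model typically reduce to a few gyrovector-space primitives. Below is a minimal numpy sketch of Möbius addition, the exponential map at the origin, and geodesic distance, using the standard curvature -1 formulas rather than the paper's implementation.

```python
import numpy as np

def mobius_add(x, y):
    """Mobius addition on the Poincare ball (curvature -1)."""
    xy = np.dot(x, y); x2 = np.dot(x, x); y2 = np.dot(y, y)
    num = (1 + 2 * xy + y2) * x + (1 - x2) * y
    return num / (1 + 2 * xy + x2 * y2)

def expmap0(v):
    """Exponential map at the origin of the Poincare ball."""
    n = np.linalg.norm(v)
    return np.tanh(n) * v / n if n > 0 else v

def poincare_dist(x, y):
    """Geodesic distance on the Poincare ball."""
    return 2.0 * np.arctanh(np.linalg.norm(mobius_add(-x, y)))

x = expmap0(np.array([0.3, 0.1]))    # map tangent vectors onto the disk
y = expmap0(np.array([-0.2, 0.4]))
print(poincare_dist(x, y))
```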
arXiv Detail & Related papers (2022-06-09T05:33:02Z)
- On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
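The notion of effective linear regions is concrete for a shallow univariate ReLU network: each neuron contributes at most one knot at $-b_i/a_i$, and a knot only counts if the slope of the function actually changes there. A minimal numpy sketch of this count (illustrative, not the paper's code):

```python
import numpy as np

def linear_regions(a, b, w, tol=1e-9):
    """Count effective linear regions of f(x) = sum_i w_i * relu(a_i*x + b_i).
    Crossing the knot -b_i/a_i changes the slope of f by w_i * |a_i|."""
    changes = {}
    for ai, bi, wi in zip(a, b, w):
        if abs(ai) < tol:
            continue                       # (near-)constant neuron: no knot
        t = -bi / ai
        changes[t] = changes.get(t, 0.0) + wi * abs(ai)
    # A knot is effective only if the net slope change there is nonzero.
    effective_knots = sum(1 for d in changes.values() if abs(d) > tol)
    return effective_knots + 1

rng = np.random.default_rng(0)
r = 5
a, b, w = rng.normal(size=r), rng.normal(size=r), rng.normal(size=r)
print(linear_regions(a, b, w))             # at most r + 1 regions for r neurons
```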
arXiv Detail & Related papers (2022-05-18T16:57:10Z)
- Geometric Graph Representation Learning via Maximizing Rate Reduction [73.6044873825311]
Learning node representations benefits various downstream tasks in graph analysis such as community detection and node classification.
We propose Geometric Graph Representation Learning (G2R) to learn node representations in an unsupervised manner.
G2R maps nodes in distinct groups into different subspaces, while each subspace is compact and different subspaces are dispersed.
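This compact-yet-dispersed subspace objective follows the maximal-coding-rate-reduction principle. The sketch below computes the usual rate-reduction quantity with numpy, under our reading of that formulation rather than the paper's released code.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """R(Z) = 1/2 * logdet(I + d/(n*eps^2) * Z @ Z.T), with Z of shape d x n."""
    d, n = Z.shape
    _, logdet = np.linalg.slogdet(np.eye(d) + d / (n * eps**2) * Z @ Z.T)
    return 0.5 * logdet

def rate_reduction(Z, labels, eps=0.5):
    """Rate of the whole feature set minus the size-weighted group rates."""
    _, n = Z.shape
    total = coding_rate(Z, eps)
    for c in np.unique(labels):
        Zc = Z[:, labels == c]
        total -= (Zc.shape[1] / n) * coding_rate(Zc, eps)
    return total

rng = np.random.default_rng(0)
Z = rng.normal(size=(8, 100))
Z /= np.linalg.norm(Z, axis=0)        # node features on the unit sphere
labels = rng.integers(0, 4, size=100)
print(rate_reduction(Z, labels))      # quantity a G2R-style model maximizes
```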
arXiv Detail & Related papers (2022-02-13T07:46:24Z)
- Arbitrary Conditional Distributions with Energy [11.081460215563633]
A more general and useful problem than joint density estimation is arbitrary conditional density estimation.
We propose a novel method, Arbitrary Conditioning with Energy (ACE), that can simultaneously estimate the distribution $p(\mathbf{x}_u \mid \mathbf{x}_o)$ for all possible subsets of unobserved features $\mathbf{x}_u$ and observed features $\mathbf{x}_o$.
We also simplify the learning problem by only learning one-dimensional conditionals, from which more complex distributions can be recovered during inference.
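Restricting to one-dimensional conditionals simplifies learning because a scalar energy can be normalized by a cheap 1-D quadrature. A toy sketch follows, with a hand-written energy standing in for ACE's learned network; the (loc, scale) parameters are hypothetical placeholders for what the network would output.

```python
import numpy as np

def neg_energy(x_u, loc, scale):
    """Toy unnormalized log-density for one unobserved scalar feature x_u.
    In ACE this role is played by an energy network conditioned on the
    observed features; (loc, scale) are stand-ins for its output."""
    return -0.5 * ((x_u - loc) / scale) ** 2

def conditional_density(grid, loc, scale):
    """Normalize exp(neg_energy) on a 1-D grid; cheap because x_u is scalar."""
    logits = neg_energy(grid, loc, scale)
    w = np.exp(logits - logits.max())    # numerically stable exponentiation
    dx = grid[1] - grid[0]
    return w / (w.sum() * dx)            # density values on the grid

grid = np.linspace(-5.0, 5.0, 1001)
p = conditional_density(grid, loc=1.0, scale=0.7)
dx = grid[1] - grid[0]
print(p.sum() * dx, grid[np.argmax(p)])  # ~1.0, and a mode near 1.0
```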
arXiv Detail & Related papers (2021-02-08T18:36:26Z)
- Generative Model without Prior Distribution Matching [26.91643368299913]
The Variational Autoencoder (VAE) and its variants are classic generative models that learn a low-dimensional latent representation constrained to match a prior distribution.
We propose to let the prior match the embedding distribution rather than imposing the latent variables to fit the prior.
arXiv Detail & Related papers (2020-09-23T09:33:24Z)
- Variational Hyper-Encoding Networks [62.74164588885455]
We propose a framework called HyperVAE for encoding distributions of neural network parameters $\theta$.
We predict the posterior distribution of the latent code, then use a matrix-network decoder to generate a posterior distribution $q(\theta)$.
arXiv Detail & Related papers (2020-05-18T06:46:09Z)
- GANs with Conditional Independence Graphs: On Subadditivity of Probability Divergences [70.30467057209405]
Generative Adversarial Networks (GANs) are modern methods to learn the underlying distribution of a data set.
GANs are designed in a model-free fashion where no additional information about the underlying distribution is available.
We propose a principled design of a model-based GAN that uses a set of simple discriminators on the neighborhoods of the Bayes-net/MRF.
arXiv Detail & Related papers (2020-03-02T04:31:22Z)
- Latent Variable Modelling with Hyperbolic Normalizing Flows [35.1659722563025]
We introduce a novel normalizing flow over hyperbolic VAEs and Euclidean normalizing flows.
Our approach achieves improved performance on density estimation, as well as reconstruction of real-world graph data.
arXiv Detail & Related papers (2020-02-15T07:44:00Z)
- Block-Approximated Exponential Random Graphs [77.4792558024487]
An important challenge in the field of exponential random graphs (ERGs) is the fitting of non-trivial ERGs on large graphs.
We propose an approximative framework for such non-trivial ERGs that results in dyadic independence (i.e., edge-independent) distributions.
Our methods are scalable to sparse graphs consisting of millions of nodes.
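Dyadic independence is what makes such models cheap to sample: every potential edge is an independent Bernoulli draw. A small numpy sketch with a hypothetical block-constant probability matrix (dense here for clarity; a scalable version would exploit the block structure rather than materializing P):

```python
import numpy as np

def sample_edge_independent(P, rng):
    """Sample an undirected graph whose edges are independent Bernoullis
    with probabilities P[i, j] (dyadic independence), no self-loops."""
    n = P.shape[0]
    upper = rng.random((n, n)) < P        # draw each dyad once
    A = np.triu(upper, k=1)               # keep the upper triangle
    return (A | A.T).astype(int)          # symmetrize

rng = np.random.default_rng(0)
blocks = np.array([[0.20, 0.02],          # within/between-block edge probs
                   [0.02, 0.15]])
labels = np.repeat([0, 1], [50, 50])
P = blocks[np.ix_(labels, labels)]        # block-constant probability matrix
A = sample_edge_independent(P, rng)
print(A.sum() // 2, "edges")
```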
arXiv Detail & Related papers (2020-02-14T11:42:16Z)