Hyperbolic Neural Networks++
- URL: http://arxiv.org/abs/2006.08210v3
- Date: Wed, 17 Mar 2021 14:36:34 GMT
- Title: Hyperbolic Neural Networks++
- Authors: Ryohei Shimizu, Yusuke Mukuta, Tatsuya Harada
- Abstract summary: We generalize the fundamental components of neural networks in a single hyperbolic geometry model, namely, the Poincaré ball model.
Experiments show the superior parameter efficiency of our methods compared to conventional hyperbolic components, and stability and outperformance over their Euclidean counterparts.
- Score: 66.16106727715061
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hyperbolic spaces, which have the capacity to embed tree structures without
distortion owing to their exponential volume growth, have recently been applied
to machine learning to better capture the hierarchical nature of data. In this
study, we generalize the fundamental components of neural networks in a single
hyperbolic geometry model, namely, the Poincaré ball model. This novel
methodology constructs a multinomial logistic regression, fully-connected
layers, convolutional layers, and attention mechanisms under a unified
mathematical interpretation, without increasing the parameters. Experiments
show the superior parameter efficiency of our methods compared to conventional
hyperbolic components, and stability and outperformance over their Euclidean
counterparts.
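
For readers unfamiliar with the Poincaré ball model named in the abstract, the following is a minimal sketch of its two basic operations, Möbius addition and geodesic distance, using the standard formulas for a ball of curvature -c. It is not the authors' code: the function names `mobius_add` and `poincare_distance` and the default `c=1.0` are illustrative assumptions, and the layers proposed in the paper are not reproduced here.

```python
import math

def mobius_add(x, y, c=1.0):
    """Mobius addition of x and y in the Poincare ball of curvature -c.

    x and y are equal-length sequences of floats with c * |v|^2 < 1.
    (Function name and default c are illustrative assumptions.)
    """
    xy = sum(a * b for a, b in zip(x, y))   # <x, y>
    x2 = sum(a * a for a in x)              # |x|^2
    y2 = sum(b * b for b in y)              # |y|^2
    denom = 1.0 + 2.0 * c * xy + c * c * x2 * y2
    coef_x = 1.0 + 2.0 * c * xy + c * y2
    coef_y = 1.0 - c * x2
    return [(coef_x * a + coef_y * b) / denom for a, b in zip(x, y)]

def poincare_distance(x, y, c=1.0):
    """Geodesic distance: (2 / sqrt(c)) * artanh(sqrt(c) * |(-x) (+) y|)."""
    neg_x = [-a for a in x]
    diff = mobius_add(neg_x, y, c)
    norm = math.sqrt(sum(d * d for d in diff))
    return 2.0 / math.sqrt(c) * math.atanh(math.sqrt(c) * norm)

if __name__ == "__main__":
    origin = [0.0, 0.0]
    p = [0.5, 0.0]
    q = [0.0, 0.9]
    # Distances grow rapidly as points approach the boundary of the ball.
    print(poincare_distance(origin, p))
    print(poincare_distance(origin, q))
    print(poincare_distance(p, q))
```

Distances from the origin reduce to (2/√c)·artanh(√c·|x|), so points near the boundary are far apart even when their Euclidean separation is small; this is the volume growth the abstract refers to.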
Related papers
- Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs).
Our approach develops within a recently introduced framework aimed at learning neural network-based surrogate models for Lattice Boltzmann collision operators.
Our work opens towards practical utilization of machine learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z) - Nonlinear classification of neural manifolds with contextual information [6.292933471495322]
Manifold capacity has emerged as a promising framework linking population geometry to the separability of neural manifolds.
We propose a theoretical framework that overcomes this limitation by leveraging contextual input information.
Our framework's increased expressivity captures representation untanglement in deep networks at early stages of the layer hierarchy, previously inaccessible to analysis.
arXiv Detail & Related papers (2024-05-10T23:37:31Z) - Decorrelating neurons using persistence [29.25969187808722]
We present two regularisation terms computed from the weights of a minimum spanning tree of a clique.
We demonstrate that naive minimisation of all correlations between neurons obtains lower accuracies than our regularisation terms.
We include a proof of differentiability of our regularisers, thus developing the first effective topological persistence-based regularisation terms.
arXiv Detail & Related papers (2023-08-09T11:09:14Z) - Capturing dynamical correlations using implicit neural representations [85.66456606776552]
We develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data.
In doing so, we illustrate the ability to build and train a differentiable model only once, which then can be applied in real-time to multi-dimensional scattering data.
arXiv Detail & Related papers (2023-04-08T07:55:36Z) - Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is end-to-end learned.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferrable to a new task in a sample efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z) - Hyperbolic Deep Neural Networks: A Survey [31.04110049167551]
We refer to the model as a hyperbolic deep neural network in this paper.
To stimulate future research, this paper presents a coherent and comprehensive review of the literature around the neural components in the construction of hyperbolic deep neural networks.
arXiv Detail & Related papers (2021-01-12T15:55:16Z) - Geometry Perspective Of Estimating Learning Capability Of Neural Networks [0.0]
The paper considers a broad class of neural networks with generalized architecture performing simple least-squares regression with stochastic gradient descent (SGD).
The relationship between the generalization capability and the stability of the neural network is also discussed.
By correlating the principles of high-energy physics with the learning theory of neural networks, the paper establishes a variant of the Complexity-Action conjecture from an artificial neural network perspective.
arXiv Detail & Related papers (2020-11-03T12:03:19Z) - Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z) - Understanding Generalization in Deep Learning via Tensor Methods [53.808840694241]
We advance the understanding of the relations between the network's architecture and its generalizability from the compression perspective.
We propose a series of intuitive, data-dependent and easily-measurable properties that tightly characterize the compressibility and generalizability of neural networks.
arXiv Detail & Related papers (2020-01-14T22:26:57Z) - Geometric deep learning for computational mechanics Part I: Anisotropic Hyperelasticity [1.8606313462183062]
This paper is the first attempt to use geometric deep learning and Sobolev training to incorporate non-Euclidean microstructural data such that anisotropic hyperelastic material machine learning models can be trained in the finite deformation range.
arXiv Detail & Related papers (2020-01-08T02:07:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.