A Differential Geometry Perspective on Orthogonal Recurrent Models
- URL: http://arxiv.org/abs/2102.09589v1
- Date: Thu, 18 Feb 2021 19:39:22 GMT
- Title: A Differential Geometry Perspective on Orthogonal Recurrent Models
- Authors: Omri Azencot, N. Benjamin Erichson, Mirela Ben-Chen, Michael W.
Mahoney
- Abstract summary: We employ tools and insights from differential geometry to offer a novel perspective on orthogonal RNNs.
We show that orthogonal RNNs may be viewed as optimizing in the space of divergence-free vector fields.
Motivated by this observation, we study a new recurrent model, which spans the entire space of vector fields.
- Score: 56.09491978954866
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, orthogonal recurrent neural networks (RNNs) have emerged as
state-of-the-art models for learning long-term dependencies. This class of
models mitigates the exploding and vanishing gradients problem by design. In
this work, we employ tools and insights from differential geometry to offer a
novel perspective on orthogonal RNNs. We show that orthogonal RNNs may be
viewed as optimizing in the space of divergence-free vector fields.
Specifically, based on a well-known result in differential geometry that
relates vector fields and linear operators, we prove that every divergence-free
vector field is related to a skew-symmetric matrix. Motivated by this
observation, we study a new recurrent model, which spans the entire space of
vector fields. Our method parameterizes vector fields via the directional
derivatives of scalar functions. This requires the construction of latent inner
product, gradient, and divergence operators. In comparison to state-of-the-art
orthogonal RNNs, our approach achieves comparable or better results on a
variety of benchmark tasks.
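The skew-symmetric link above has a concrete computational face: the matrix exponential of a skew-symmetric matrix is orthogonal, so recurrent transitions built this way preserve the hidden-state norm and keep gradients from exploding or vanishing. A minimal NumPy/SciPy sketch of this general mechanism (an illustration, not the authors' exact parameterization):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

# Any square matrix B yields a skew-symmetric matrix A = B - B^T.
B = rng.standard_normal((8, 8))
A = B - B.T                                  # A^T = -A

# The matrix exponential of A is orthogonal:
# exp(A)^T exp(A) = exp(A^T) exp(A) = exp(-A) exp(A) = I.
W = expm(A)
print(np.allclose(W.T @ W, np.eye(8)))       # True

# Orthogonal transitions preserve norms, so repeated application
# neither explodes nor vanishes the hidden state.
h = rng.standard_normal(8)
norm0 = np.linalg.norm(h)
for _ in range(1000):
    h = W @ h
print(np.isclose(np.linalg.norm(h), norm0))  # True
```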
Related papers
- An Intrinsic Vector Heat Network [64.55434397799728]
This paper introduces a novel neural network architecture for learning tangent vector fields on surfaces embedded in 3D.
We introduce a trainable vector heat diffusion module to spatially propagate vector-valued feature data across the surface.
We also demonstrate the effectiveness of our method on the useful industrial application of quadrilateral mesh generation.
arXiv Detail & Related papers (2024-06-14T00:40:31Z)
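To illustrate the diffusion step behind the vector heat module above, here is a minimal sketch of propagating features over a graph with one implicit Euler heat step; the paper diffuses tangent-vector data with a connection Laplacian on a surface, for which the plain graph Laplacian below is only a toy stand-in:

```python
import numpy as np

n, t = 20, 0.5
A = np.zeros((n, n))
for i in range(n):                            # ring graph adjacency
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
L = np.diag(A.sum(axis=1)) - A                # combinatorial Laplacian

# Implicit Euler step of the heat equation: x_t = (I + t L)^{-1} x_0,
# applied to 2-channel features placed at a single node.
x0 = np.zeros((n, 2))
x0[0] = [1.0, 0.0]
xt = np.linalg.solve(np.eye(n) + t * L, x0)
print(xt[:3])                                 # mass spreads to neighbors
```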
- Geometric Neural Diffusion Processes [55.891428654434634]
We extend the framework of diffusion models to incorporate a series of geometric priors in infinite-dimensional modelling.
We show that with these conditions, the generative functional model admits the same symmetry.
arXiv Detail & Related papers (2023-07-11T16:51:38Z)
- Semisupervised regression in latent structure networks on unknown manifolds [7.5722195869569]
We consider random dot product graphs, in which an edge is formed between two nodes with probability given by the inner product of their respective latent positions.
We propose a manifold learning and graph embedding technique to predict the response variable on out-of-sample nodes.
arXiv Detail & Related papers (2023-05-04T00:41:04Z)
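The random dot product graph model above is easy to simulate: sample latent positions, form edges independently with inner-product probabilities, and recover the positions (up to an orthogonal transformation) by an adjacency spectral embedding. The sketch below illustrates the model itself, not the paper's out-of-sample prediction method:

```python
import numpy as np

rng = np.random.default_rng(1)

# Latent positions chosen so that all inner products lie in [0, 1].
n, d = 100, 2
X = rng.uniform(0.2, 0.7, size=(n, d))
P = np.clip(X @ X.T, 0.0, 1.0)                # edge-probability matrix

U = rng.uniform(size=(n, n))
A = np.triu((U < P).astype(float), k=1)       # sample each edge once
A = A + A.T                                   # undirected, no self-loops

# Adjacency spectral embedding: scaled top-d eigenvectors of A.
w, V = np.linalg.eigh(A)
Xhat = V[:, -d:] * np.sqrt(np.abs(w[-d:]))
print(Xhat.shape)                             # (100, 2), ~ X up to rotation
```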
- VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
The nonlinearity of the generator implies that the latent space provides an unsatisfactory projection of the data space, which results in poor representation learning.
We show that geodesics and accurate computation can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z)
- Designing Universal Causal Deep Learning Models: The Case of Infinite-Dimensional Dynamical Systems from Stochastic Analysis [3.5450828190071655]
Causal operators (COs) play a central role in contemporary stochastic analysis.
There is still no canonical framework for designing Deep Learning (DL) models capable of approximating COs.
This paper proposes a "geometry-aware" solution to this open problem by introducing a DL model-design framework.
arXiv Detail & Related papers (2022-10-24T14:43:03Z)
- Input Convex Gradient Networks [7.747759814657507]
We study how to model convex gradients by integrating a Jacobian-vector product parameterized by a neural network.
We empirically demonstrate that a single-layer Input Convex Gradient Network (ICGN) can fit a toy example better than a single-layer Input Convex Neural Network (ICNN).
arXiv Detail & Related papers (2021-11-23T22:51:25Z)
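A toy version of the gradient-by-integration idea above: parameterize a symmetric positive semidefinite map J and obtain a gradient field by integrating the Jacobian-vector product J(tx)x along the segment from 0 to x. The map J below is a hypothetical stand-in, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.standard_normal((4, 4))

def J(y):
    # Toy input-dependent map, made symmetric PSD via M M^T.
    M = np.tanh(W * y.sum()) + np.eye(4)
    return M @ M.T

def g(x, steps=64):
    # Midpoint-rule integration of the Jacobian-vector product
    # along the straight path from the origin to x.
    ts = (np.arange(steps) + 0.5) / steps
    return sum(J(t * x) @ x for t in ts) / steps

x = rng.standard_normal(4)
print(g(x))                                   # candidate convex gradient at x
```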
- Equivariant vector field network for many-body system modeling [65.22203086172019]
The Equivariant Vector Field Network (EVFN) is built on a novel equivariant basis and the associated scalarization and vectorization layers.
We evaluate our method on predicting trajectories of simulated Newton mechanics systems with both full and partially observed data.
arXiv Detail & Related papers (2021-10-26T14:26:25Z)
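Rotation equivariance of the kind EVFN targets can be demonstrated with the generic construction of weighting pairwise displacement vectors by distance-based invariant scalars (a standard sketch, not EVFN's particular basis):

```python
import numpy as np

rng = np.random.default_rng(3)

def vector_field(X):
    # Pairwise displacements weighted by rotation-invariant scalars.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.linalg.norm(diff, axis=-1, keepdims=True)
    w = np.exp(-dist**2)
    return (w * diff).sum(axis=1)             # one output vector per particle

X = rng.standard_normal((5, 3))               # 5 particles in 3D
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # random orthogonal matrix

# Equivariance check: rotating the inputs rotates the outputs.
print(np.allclose(vector_field(X @ Q.T), vector_field(X) @ Q.T))  # True
```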
- Feature Engineering with Regularity Structures [4.082216579462797]
We investigate the use of models from the theory of regularity structures as features in machine learning tasks.
We provide a flexible definition of a model feature vector associated to a space-time signal, along with two algorithms which illustrate ways in which these features can be combined with linear regression.
We apply these algorithms in several numerical experiments designed to learn solutions to PDEs with a given forcing and boundary data.
arXiv Detail & Related papers (2021-08-12T17:53:47Z)
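A toy version of the feature-then-regress recipe above: build features of a forcing signal by iterated smoothing and pointwise products, then fit the solution with linear regression. The features below are ad hoc illustrations, not the paper's model feature vectors from regularity structures:

```python
import numpy as np

rng = np.random.default_rng(5)

n = 200
x = np.linspace(0.0, 1.0, n)
K = np.exp(-((x[:, None] - x[None, :]) ** 2) / 0.01) / n  # smoothing kernel

f = rng.standard_normal(n)                    # forcing signal
Kf = K @ f
u = Kf + 0.1 * Kf**2                          # synthetic "solution" to learn

# Feature vector per grid point: constant, smoothed forcing, products.
features = np.stack([np.ones(n), Kf, Kf**2, K @ (Kf**2)], axis=1)
coef, *_ = np.linalg.lstsq(features, u, rcond=None)
print(np.round(coef, 3))                      # ~ [0, 1, 0.1, 0]
```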
- Disentangled Representation Learning and Generation with Manifold Optimization [10.69910379275607]
This work presents a representation learning framework that explicitly promotes disentanglement by encouraging orthogonal directions of variation.
Our theoretical discussion and various experiments show that the proposed model improves over many VAE variants in terms of both generation quality and disentangled representation learning.
arXiv Detail & Related papers (2020-06-12T10:00:49Z)
- Understanding Graph Neural Networks with Generalized Geometric Scattering Transforms [67.88675386638043]
The scattering transform is a multilayered wavelet-based deep learning architecture that acts as a model of convolutional neural networks.
We introduce windowed and non-windowed geometric scattering transforms for graphs based upon a very general class of asymmetric wavelets.
We show that these asymmetric graph scattering transforms have many of the same theoretical guarantees as their symmetric counterparts.
arXiv Detail & Related papers (2019-11-14T17:23:06Z)
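To make the wavelet cascade above concrete, here is a minimal sketch of first-order geometric scattering features built from lazy random-walk diffusion wavelets; this is the standard symmetric construction, which the paper generalizes to asymmetric wavelets:

```python
import numpy as np

rng = np.random.default_rng(4)

n = 30
A = (rng.uniform(size=(n, n)) < 0.1).astype(float)
A = np.triu(A, k=1)
A = A + A.T                                   # random undirected graph
deg = np.maximum(A.sum(axis=1), 1.0)
P = 0.5 * (np.eye(n) + A / deg[:, None])      # lazy random-walk operator

def wavelets(scales=3):
    # Dyadic diffusion wavelets: psi_j = P^(2^(j-1)) - P^(2^j).
    return [np.linalg.matrix_power(P, 2 ** (j - 1))
            - np.linalg.matrix_power(P, 2 ** j)
            for j in range(1, scales + 1)]

sig = rng.standard_normal(n)                  # signal on graph nodes
feats = [np.abs(Wj @ sig).mean() for Wj in wavelets()]
print(np.round(feats, 4))                     # first-order scattering features
```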
This list is automatically generated from the titles and abstracts of the papers on this site.