Abelian Neural Networks
- URL: http://arxiv.org/abs/2102.12232v1
- Date: Wed, 24 Feb 2021 11:52:21 GMT
- Title: Abelian Neural Networks
- Authors: Kenshin Abe and Takanori Maehara and Issei Sato
- Abstract summary: We first construct a neural network architecture for Abelian group operations and derive a universal approximation property.
We extend it to Abelian semigroup operations using the characterization of associative symmetric polynomials.
We train our models over fixed word embeddings and demonstrate improved performance over the original word2vec.
- Score: 48.52497085313911
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of modeling a binary operation that satisfies some
algebraic requirements. We first construct a neural network architecture for
Abelian group operations and derive a universal approximation property. Then,
we extend it to Abelian semigroup operations using the characterization of
associative symmetric polynomials. Both models take advantage of the analytic
invertibility of invertible neural networks. For each case, by repeating the
binary operations, we can represent a function for multiset input thanks to the
algebraic structure. Naturally, our multiset architecture has
size-generalization ability, which existing methods do not provide.
Further, we show that modeling the Abelian group operation itself is useful in
a word analogy task. We train our models over fixed word embeddings and
demonstrate improved performance over the original word2vec and another naive
learning method.
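The construction can be made concrete: conjugate vector addition by an invertible map, so that x * y = phi^{-1}(phi(x) + phi(y)) is commutative and associative by design, and summing in phi-space extends the operation to multisets of any size. Below is a minimal NumPy sketch of this idea; the single additive coupling layer (and its parameters W, D) is an illustrative stand-in for the paper's invertible networks, not the authors' exact architecture.

import numpy as np

# Minimal sketch: an invertible map phi (one additive coupling layer,
# purely illustrative) conjugates vector addition into a learned,
# exactly commutative and associative binary operation.
rng = np.random.default_rng(0)
D = 4                                   # embedding dimension (even, for the split)
W = rng.normal(size=(D // 2, D // 2))   # coupling parameters (hypothetical)

def phi(x):
    """Additive coupling (a, b) -> (a, b + tanh(W a)); invertible in closed form."""
    a, b = x[:D // 2], x[D // 2:]
    return np.concatenate([a, b + np.tanh(W @ a)])

def phi_inv(z):
    """Exact inverse of the coupling layer."""
    a, c = z[:D // 2], z[D // 2:]
    return np.concatenate([a, c - np.tanh(W @ a)])

def op(x, y):
    """Binary operation conjugated through phi; Abelian because + is."""
    return phi_inv(phi(x) + phi(y))

def multiset_op(xs):
    """Multiset extension: sum in phi-space, then invert. The plain sum
    is what gives size generalization to unseen input sizes."""
    return phi_inv(np.sum([phi(x) for x in xs], axis=0))

x, y, z = (rng.normal(size=D) for _ in range(3))
assert np.allclose(op(x, y), op(y, x))                  # commutativity
assert np.allclose(op(op(x, y), z), op(x, op(y, z)))    # associativity
assert np.allclose(multiset_op([x, y, z]), op(op(x, y), z))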
Related papers
- Group and Shuffle: Efficient Structured Orthogonal Parametrization [3.540195249269228]
We introduce a new class of structured matrices, which unifies and generalizes structured classes from previous works.
We empirically validate our method on different domains, including adaptation of text-to-image diffusion models and downstream fine-tuning in language modeling.
arXiv Detail & Related papers (2024-06-14T13:29:36Z)
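The summary above leaves the parametrization abstract. As a rough, hypothetical sketch in the spirit of the title, the snippet below composes block-diagonal orthogonal factors with a fixed permutation ("shuffle"); the product is guaranteed orthogonal with far fewer parameters than a dense matrix, but the exact class in the paper may differ.

import numpy as np

# Hypothetical "blocks + shuffle" parametrization: block-diagonal
# orthogonal factors interleaved with a fixed permutation. Orthogonality
# of the product is guaranteed; the parameter count is O(n * b) rather
# than O(n^2).
rng = np.random.default_rng(0)
n, b = 8, 2                                 # matrix size, block size (b divides n)

def random_orthogonal(k):
    q, _ = np.linalg.qr(rng.normal(size=(k, k)))
    return q

def block_diag_orthogonal():
    out = np.zeros((n, n))
    for i in range(n // b):
        out[i * b:(i + 1) * b, i * b:(i + 1) * b] = random_orthogonal(b)
    return out

perm = np.eye(n)[rng.permutation(n)]        # the "shuffle" (also orthogonal)
Wmat = block_diag_orthogonal() @ perm @ block_diag_orthogonal()
assert np.allclose(Wmat @ Wmat.T, np.eye(n))  # product of orthogonal factors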
- Lie Group Decompositions for Equivariant Neural Networks [12.139222986297261]
We show how convolution kernels can be parametrized to build models equivariant with respect to affine transformations.
We evaluate the robustness and out-of-distribution generalisation capability of our model on the benchmark affine-invariant classification task.
arXiv Detail & Related papers (2023-10-17T16:04:33Z)
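For readers new to the terminology, equivariance means the layer commutes with the group action: transforming the input and then applying the layer equals applying the layer and then transforming the output. The toy check below uses circular convolution, which is equivariant to cyclic shifts; it is a conceptual stand-in only, as the paper handles the harder affine case.

import numpy as np

def circular_conv(x, k):
    """A cyclic-shift-equivariant layer: circular convolution."""
    n = len(x)
    return np.array([sum(k[j] * x[(i - j) % n] for j in range(len(k)))
                     for i in range(n)])

rng = np.random.default_rng(0)
x, k = rng.normal(size=8), rng.normal(size=3)
s = 3                                        # shift amount (the group element)
lhs = circular_conv(np.roll(x, s), k)        # act on input, then apply layer
rhs = np.roll(circular_conv(x, k), s)        # apply layer, then act on output
assert np.allclose(lhs, rhs)                 # equivariance: the two commute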
- A Recursively Recurrent Neural Network (R2N2) Architecture for Learning Iterative Algorithms [64.3064050603721]
We generalize the Runge-Kutta neural network to a recurrent neural network (R2N2) superstructure for the design of customized iterative algorithms.
We demonstrate that regular training of the weight parameters inside the proposed superstructure on input/output data of various computational problem classes yields similar iterations to Krylov solvers for linear equation systems, Newton-Krylov solvers for nonlinear equation systems, and Runge-Kutta solvers for ordinary differential equations.
arXiv Detail & Related papers (2022-11-22T16:30:33Z)
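For intuition, a Runge-Kutta step is itself a small recurrent computation over stages, and making the tableau entries learnable gives the simplest instance of such a superstructure. The sketch below is a generic parametrized RK step, initialized here to classical RK4; it is not the paper's full R2N2 architecture.

import numpy as np

# A parametrized Runge-Kutta step: the tableau (A, b) would be learnable
# weights in a recurrent superstructure; here it is fixed to classical RK4.
A = np.array([[0.0, 0.0, 0.0, 0.0],
              [0.5, 0.0, 0.0, 0.0],
              [0.0, 0.5, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
b = np.array([1.0, 2.0, 2.0, 1.0]) / 6.0

def rk_step(f, x, h):
    """One step for the autonomous ODE x' = f(x), recurring over stages."""
    k = []
    for i in range(len(b)):
        k.append(f(x + h * sum(A[i, j] * k[j] for j in range(i))))
    return x + h * sum(bi * ki for bi, ki in zip(b, k))

x, h = np.array([1.0]), 0.1
for _ in range(10):                  # integrate x' = -x up to t = 1
    x = rk_step(lambda u: -u, x, h)
print(x, np.exp(-1.0))               # both ~ 0.36788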
- Equivariant Transduction through Invariant Alignment [71.45263447328374]
We introduce a novel group-equivariant architecture that incorporates a group-invariant hard alignment mechanism.
We find that our network's structure allows it to develop stronger equivariant properties than existing group-equivariant approaches.
We additionally find that it outperforms previous group-equivariant networks empirically on the SCAN task.
arXiv Detail & Related papers (2022-09-22T11:19:45Z)
- Bispectral Neural Networks [1.0323063834827415]
We present a neural network architecture, Bispectral Neural Networks (BNNs).
BNNs are able to simultaneously learn groups, their irreducible representations, and corresponding equivariant and complete-invariant maps.
arXiv Detail & Related papers (2022-09-07T18:34:48Z)
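For the cyclic translation group, the invariant these networks generalize has a closed form: the bispectrum B(k1, k2) = F(k1) F(k2) conj(F(k1 + k2)), where F is the DFT. It is invariant to shifts yet, unlike the power spectrum, complete up to a shift for generic signals. The check below uses this known group; BNNs instead learn the group and the map from data.

import numpy as np

def bispectrum(x):
    """Translation bispectrum B(k1, k2) = F(k1) F(k2) conj(F(k1 + k2))."""
    F, n = np.fft.fft(x), len(x)
    k = np.arange(n)
    return F[:, None] * F[None, :] * np.conj(F[(k[:, None] + k[None, :]) % n])

rng = np.random.default_rng(0)
x = rng.normal(size=16)
assert np.allclose(bispectrum(x), bispectrum(np.roll(x, 5)))  # shift-invariant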
- Learning Algebraic Recombination for Compositional Generalization [71.78771157219428]
We propose LeAR, an end-to-end neural model to learn algebraic recombination for compositional generalization.
The key insight is to model the semantic parsing task as a homomorphism between a latent syntactic algebra and a semantic algebra.
Experiments on two realistic and comprehensive compositional generalization benchmarks demonstrate the effectiveness of our model.
arXiv Detail & Related papers (2021-07-14T07:23:46Z)
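The homomorphism condition says that interpretation commutes with composition: h(f(a, b)) = f'(h(a), h(b)). A toy instance, with expression trees as the syntactic algebra and integers as the semantic algebra (an illustrative stand-in, not LeAR's learned latent algebras):

# Syntactic algebra: expression trees built by a binary constructor.
def syn_add(a, b):
    return ("add", a, b)

# Semantic algebra: integers under +.
def sem_add(a, b):
    return a + b

def h(term):
    """The homomorphism (interpretation): leaves denote themselves."""
    if isinstance(term, tuple):
        _, a, b = term
        return sem_add(h(a), h(b))
    return term

t = syn_add(syn_add(1, 2), 3)
assert h(t) == sem_add(sem_add(h(1), h(2)), h(3))  # h(f(a, b)) == f'(h(a), h(b))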
- Stability of Algebraic Neural Networks to Small Perturbations [179.55535781816343]
Algebraic neural networks (AlgNNs) are composed of a cascade of layers, each one associated with an algebraic signal model.
We show how any architecture that uses a formal notion of convolution can be stable beyond particular choices of the shift operator.
arXiv Detail & Related papers (2020-10-22T09:10:16Z)
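Concretely, a "formal notion of convolution" means a filter is a polynomial in a shift operator S, y = sum_k h_k S^k x. Choosing S as a cyclic shift recovers ordinary convolution, while a graph shift operator gives a graph filter; the stability question concerns perturbations of S. A minimal sketch (the operator and taps are arbitrary examples):

import numpy as np

n = 6
S = np.roll(np.eye(n), 1, axis=0)    # shift operator: here a cyclic shift
h = [0.5, 0.3, 0.2]                  # filter taps (arbitrary example)

def algebraic_filter(S, h, x):
    """y = sum_k h_k S^k x, the 'formal convolution' of the signal model."""
    y, Sk = np.zeros_like(x), np.eye(len(x))
    for hk in h:
        y = y + hk * (Sk @ x)
        Sk = Sk @ S                  # next power of the shift operator
    return y

x = np.arange(n, dtype=float)
print(algebraic_filter(S, h, x))     # an ordinary circular convolution here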
- Stochastic Flows and Geometric Optimization on the Orthogonal Group [52.50121190744979]
We present a new class of geometrically-driven optimization algorithms on the orthogonal group $O(d)$.
We show that our methods can be applied in various fields of machine learning, including deep, convolutional, and recurrent neural networks, reinforcement learning, normalizing flows, and metric learning.
arXiv Detail & Related papers (2020-03-30T15:37:50Z)
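As a generic illustration of geometric optimization on O(d) (not the paper's stochastic-flow algorithms, which are designed to be cheaper), the sketch below takes Cayley-transform steps along skew-symmetric directions, so every iterate stays exactly orthogonal:

import numpy as np

rng = np.random.default_rng(0)
d = 5
target = np.linalg.qr(rng.normal(size=(d, d)))[0]
if np.linalg.det(target) < 0:
    target[:, 0] *= -1               # keep the target in SO(d), reachable from I

X, eta = np.eye(d), 0.1
for _ in range(500):
    G = X - target                   # Euclidean gradient of 0.5 * ||X - target||^2
    A = G @ X.T - X @ G.T            # skew-symmetric direction on the manifold
    Q = np.linalg.solve(np.eye(d) + 0.5 * eta * A,
                        np.eye(d) - 0.5 * eta * A)   # Cayley retraction
    X = Q @ X                        # orthogonality is preserved exactly

print(np.linalg.norm(X @ X.T - np.eye(d)))  # ~0: still on the manifold
print(np.linalg.norm(X - target))           # far below its starting value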
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.