Generalization capabilities of translationally equivariant neural
networks
- URL: http://arxiv.org/abs/2103.14686v1
- Date: Fri, 26 Mar 2021 18:53:36 GMT
- Title: Generalization capabilities of translationally equivariant neural
networks
- Authors: Srinath Bulusu, Matteo Favoni, Andreas Ipp, David I. Müller, Daniel Schuh
- Abstract summary: In this work, we focus on complex scalar field theory on a two-dimensional lattice and investigate the benefits of using group equivariant convolutional neural network architectures.
For a meaningful comparison, we conduct a systematic search for equivariant and non-equivariant neural network architectures and apply them to various regression and classification tasks.
We demonstrate that our best equivariant architectures can perform and generalize significantly better than their non-equivariant counterparts.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rising adoption of machine learning in high energy physics and lattice
field theory necessitates the re-evaluation of common methods that are widely
used in computer vision, which, when applied to problems in physics, can lead
to significant drawbacks in terms of performance and generalizability. One
particular example of this is the use of neural network architectures that do
not reflect the underlying symmetries of the given physical problem. In this
work, we focus on complex scalar field theory on a two-dimensional lattice and
investigate the benefits of using group equivariant convolutional neural
network architectures based on the translation group. For a meaningful
comparison, we conduct a systematic search for equivariant and non-equivariant
neural network architectures and apply them to various regression and
classification tasks. We demonstrate that in most of these tasks our best
equivariant architectures can perform and generalize significantly better than
their non-equivariant counterparts, which applies not only to physical
parameters beyond those represented in the training set, but also to different
lattice sizes.
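As a rough illustration of the kind of architecture the abstract describes (and not the authors' published code), the sketch below builds a small convolutional network whose layers commute with lattice translations: circular padding respects the periodic boundary conditions of the lattice torus, and a global average over lattice sites turns the equivariant feature maps into a translation-invariant prediction. The two input channels for the real and imaginary parts of the complex scalar field, the channel widths, and the output dimension are illustrative assumptions.

```python
# Minimal sketch of a translationally equivariant CNN for a complex scalar
# field on a periodic 2D lattice (assumptions: 2 input channels = Re/Im parts,
# arbitrary hidden widths; not the authors' implementation).
import torch
import torch.nn as nn


class EquivariantLatticeCNN(nn.Module):
    def __init__(self, hidden_channels: int = 16, n_outputs: int = 1):
        super().__init__()
        # Circular padding implements periodic boundary conditions, so every
        # convolution commutes with translations on the lattice torus.
        self.features = nn.Sequential(
            nn.Conv2d(2, hidden_channels, kernel_size=3, padding=1,
                      padding_mode="circular"),
            nn.ReLU(),
            nn.Conv2d(hidden_channels, hidden_channels, kernel_size=3,
                      padding=1, padding_mode="circular"),
            nn.ReLU(),
        )
        # A 1x1 convolution keeps the network fully convolutional, so the same
        # weights apply to any lattice size.
        self.head = nn.Conv2d(hidden_channels, n_outputs, kernel_size=1)

    def forward(self, phi: torch.Tensor) -> torch.Tensor:
        # phi: (batch, 2, Nx, Ny) with channels (Re phi, Im phi).
        h = self.features(phi)       # translation-equivariant feature maps
        h = self.head(h)             # per-site predictions
        return h.mean(dim=(-2, -1))  # spatial average -> translation-invariant output


# Because no layer depends on the lattice extent, the same weights can be
# trained on small lattices and evaluated on larger ones, which is the kind of
# generalization across lattice sizes the abstract refers to.
model = EquivariantLatticeCNN()
small = torch.randn(4, 2, 8, 8)    # e.g. an 8x8 training lattice
large = torch.randn(4, 2, 16, 16)  # e.g. a 16x16 evaluation lattice
print(model(small).shape, model(large).shape)  # both: torch.Size([4, 1])
```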
Related papers
- Equivariant Neural Tangent Kernels [2.373992571236766]
We give explicit expressions for neural tangent kernels (NTKs) of group convolutional neural networks.
In numerical experiments, we demonstrate superior performance for equivariant NTKs over non-equivariant NTKs on a classification task for medical images.
arXiv Detail & Related papers (2024-06-10T17:43:13Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- A Characterization Theorem for Equivariant Networks with Point-wise Activations [13.00676132572457]
We prove that rotation-equivariant networks can only be invariant, as it happens for any network which is equivariant with respect to connected compact groups.
We show that feature spaces of disentangled steerable convolutional neural networks are trivial representations.
arXiv Detail & Related papers (2024-01-17T14:30:46Z)
- Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks.
We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order.
In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
arXiv Detail & Related papers (2023-02-27T18:52:38Z)
- Unifying O(3) Equivariant Neural Networks Design with Tensor-Network Formalism [12.008737454250463]
We propose using fusion diagrams, a technique widely employed in simulating SU(2)-symmetric quantum many-body problems, to design new equivariant components for equivariant neural networks.
When applied to particles within a given local neighborhood, the resulting components, which we term "fusion blocks," serve as universal approximators of any continuous equivariant function.
Our approach, which combines tensor networks with equivariant neural networks, suggests a potentially fruitful direction for designing more expressive equivariant neural networks.
arXiv Detail & Related papers (2022-11-14T16:06:59Z)
- PELICAN: Permutation Equivariant and Lorentz Invariant or Covariant Aggregator Network for Particle Physics [64.5726087590283]
We present a machine learning architecture that uses a set of inputs maximally reduced with respect to the full 6-dimensional Lorentz symmetry.
We show that the resulting network outperforms all existing competitors despite much lower model complexity.
arXiv Detail & Related papers (2022-11-01T13:36:50Z)
- Equivariance and generalization in neural networks [0.0]
We focus on the consequences of incorporating translational equivariance among the network properties.
The benefits of equivariant networks are exemplified by studying a complex scalar field theory.
In most of the tasks our best equivariant architectures can perform and generalize significantly better than their non-equivariant counterparts.
arXiv Detail & Related papers (2021-12-23T12:38:32Z)
- Generalization capabilities of neural networks in lattice applications [0.0]
We investigate the advantages of adopting translationally equivariant neural networks in favor of non-equivariant ones.
We show that our best equivariant architectures can perform and generalize significantly better than their non-equivariant counterparts.
arXiv Detail & Related papers (2021-12-23T11:48:06Z)
- Equivariant vector field network for many-body system modeling [65.22203086172019]
Equivariant Vector Field Network (EVFN) is built on a novel equivariant basis and the associated scalarization and vectorization layers.
We evaluate our method on predicting trajectories of simulated Newton mechanics systems with both full and partially observed data.
arXiv Detail & Related papers (2021-10-26T14:26:25Z)
- Frame Averaging for Invariant and Equivariant Network Design [50.87023773850824]
We introduce Frame Averaging (FA), a framework for adapting known (backbone) architectures to become invariant or equivariant to new symmetry types (a minimal sketch of this averaging idea appears after this list).
We show that FA-based models have maximal expressive power in a broad setting.
We propose a new class of universal Graph Neural Networks (GNNs), universal Euclidean motion invariant point cloud networks, and Euclidean motion invariant Message Passing (MP) GNNs.
arXiv Detail & Related papers (2021-10-07T11:05:23Z)
- Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded by the 'complexity' of the fractal structure that underlies its generalization measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z)
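The Frame Averaging entry above rests on the general idea of symmetrizing a non-equivariant backbone by averaging it over group actions. A minimal sketch of that idea, specialized to the translation group of a periodic 2D lattice and using a generic, hypothetical PyTorch backbone, could look as follows; for translations the frame is taken to be the whole group, so this reduces to plain group averaging rather than the paper's more economical frames.

```python
# Minimal sketch of symmetrization by group averaging (the idea behind frame
# averaging, specialized to lattice translations, where the frame is the whole
# group). `backbone` is any network mapping (batch, C, Nx, Ny) -> (batch, d);
# it does not need to be equivariant itself.
import torch
import torch.nn as nn


def translation_averaged(backbone: nn.Module, phi: torch.Tensor) -> torch.Tensor:
    """Average the backbone over all lattice translations of the input.

    The result is exactly translation invariant, because translating `phi`
    only permutes the terms of the sum.
    """
    _, _, nx, ny = phi.shape
    outputs = []
    for dx in range(nx):
        for dy in range(ny):
            shifted = torch.roll(phi, shifts=(dx, dy), dims=(-2, -1))
            outputs.append(backbone(shifted))
    return torch.stack(outputs, dim=0).mean(dim=0)


# Toy usage with a deliberately non-equivariant backbone (flatten + linear,
# tied to an 8x8 lattice); the averaged wrapper is still invariant, at the
# cost of one forward pass per group element.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(2 * 8 * 8, 1))
phi = torch.randn(3, 2, 8, 8)
out = translation_averaged(backbone, phi)
shifted_out = translation_averaged(backbone, torch.roll(phi, shifts=(2, 5), dims=(-2, -1)))
print(torch.allclose(out, shifted_out, atol=1e-5))  # True up to float rounding
```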