Diagonal Symmetrization of Neural Network Solvers for the Many-Electron Schrödinger Equation
- URL: http://arxiv.org/abs/2502.05318v1
- Date: Fri, 07 Feb 2025 20:37:25 GMT
- Title: Diagonal Symmetrization of Neural Network Solvers for the Many-Electron Schrödinger Equation
- Authors: Kevin Han Huang, Ni Zhan, Elif Ertekin, Peter Orbanz, Ryan P. Adams
- Abstract summary: We study different ways of incorporating diagonal invariance in neural network ansätze trained via variational Monte Carlo methods.
We show that, contrary to standard ML setups, in-training symmetrization destabilizes training and can lead to worse performance.
Our theoretical and numerical results indicate that this unexpected behavior may arise from a unique computational-statistical tradeoff not found in standard ML analyses of symmetrization.
- Score: 11.202098800341096
- Abstract: Incorporating group symmetries into neural networks has been a cornerstone of success in many AI-for-science applications. Diagonal groups of isometries, which describe the invariance under a simultaneous movement of multiple objects, arise naturally in many-body quantum problems. Despite their importance, diagonal groups have received relatively little attention, as they lack a natural choice of invariant maps except in special cases. We study different ways of incorporating diagonal invariance in neural network ansätze trained via variational Monte Carlo methods, and consider specifically data augmentation, group averaging and canonicalization. We show that, contrary to standard ML setups, in-training symmetrization destabilizes training and can lead to worse performance. Our theoretical and numerical results indicate that this unexpected behavior may arise from a unique computational-statistical tradeoff not found in standard ML analyses of symmetrization. Meanwhile, we demonstrate that post hoc averaging is less sensitive to such tradeoffs and emerges as a simple, flexible and effective method for improving neural network solvers.
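As a rough illustration of the post hoc averaging idea mentioned in the abstract, the sketch below averages a deliberately non-invariant function over a small diagonal group. Everything here is an assumption made for illustration and is not taken from the paper: the group is the four planar rotations by multiples of 90 degrees, and `toy_ansatz`, `rotate_all` and `group_averaged` are hypothetical names. A real variational Monte Carlo ansatz would also have to respect antisymmetry and the actual symmetry group of the system.

```python
import math

def rotate_all(points, k):
    """Apply the same rotation by k * 90 degrees to every particle (diagonal action)."""
    # Exact integer cosine/sine pairs for 0, 90, 180 and 270 degrees.
    c, s = [(1, 0), (0, 1), (-1, 0), (0, -1)][k % 4]
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def toy_ansatz(points):
    """Hypothetical stand-in for a trained network output; not rotation invariant."""
    return sum(math.tanh(0.7 * x - 0.3 * y + 0.1 * i)
               for i, (x, y) in enumerate(points))

def group_averaged(f, points):
    """Post hoc averaging: mean of f over all four simultaneous rotations."""
    return sum(f(rotate_all(points, k)) for k in range(4)) / 4.0

# Three particles in the plane, and the same configuration rotated as a whole.
config = [(0.5, -1.2), (1.3, 0.4), (-0.8, 0.9)]
rotated = rotate_all(config, 1)

# The raw function changes under the rotation; the averaged one does not
# (up to floating-point rounding).
raw_gap = abs(toy_ansatz(config) - toy_ansatz(rotated))
avg_gap = abs(group_averaged(toy_ansatz, config)
              - group_averaged(toy_ansatz, rotated))
print(raw_gap, avg_gap)
```

The averaging is applied to an already fixed function, so it costs one evaluation per group element at inference time and leaves training untouched, which is the sense of "post hoc" used here.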
Related papers
- Symmetry Discovery for Different Data Types [52.2614860099811]
Equivariant neural networks incorporate symmetries into their architecture, achieving higher generalization performance.
We propose LieSD, a method for discovering symmetries via trained neural networks which approximate the input-output mappings of the tasks.
We validate the performance of LieSD on tasks with symmetries such as the two-body problem, the moment of inertia matrix prediction, and top quark tagging.
arXiv Detail & Related papers (2024-10-13T13:39:39Z)
- Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs).
Our approach develops within a recently introduced framework aimed at learning neural network-based surrogate models for Lattice Boltzmann collision operators.
Our work opens towards practical utilization of machine learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z)
- Trade-Offs of Diagonal Fisher Information Matrix Estimators [53.35448232352667]
The Fisher information matrix can be used to characterize the local geometry of the parameter space of neural networks.
We examine two popular estimators whose accuracy and sample complexity depend on their associated variances.
We derive bounds of the variances and instantiate them in neural networks for regression and classification.
arXiv Detail & Related papers (2024-02-08T03:29:10Z)
- On the hardness of learning under symmetries [31.961154082757798]
We study the problem of learning equivariant neural networks via gradient descent.
In spite of the inductive bias via symmetry, actually learning the complete classes of functions represented by equivariant neural networks via gradient descent remains hard.
arXiv Detail & Related papers (2024-01-03T18:24:18Z)
- Symmetry Breaking and Equivariant Neural Networks [17.740760773905986]
We introduce a novel notion of 'relaxed equivariance'.
We show how to incorporate this relaxation into equivariant multilayer perceptrons (E-MLPs).
The relevance of symmetry breaking is then discussed in various application domains.
arXiv Detail & Related papers (2023-12-14T15:06:48Z)
- Revisiting Gaussian Neurons for Online Clustering with Unknown Number of Clusters [0.0]
A novel local learning rule is presented that performs online clustering with a maximum limit on the number of clusters to be found.
The experimental results demonstrate stability in the learned parameters across a large number of training samples.
arXiv Detail & Related papers (2022-05-02T14:01:40Z)
- Learning Invariant Weights in Neural Networks [16.127299898156203]
Many commonly used models in machine learning are constrained to respect certain symmetries in the data.
We propose a weight-space equivalent to this approach, by minimizing a lower bound on the marginal likelihood to learn invariances in neural networks.
arXiv Detail & Related papers (2022-02-25T00:17:09Z)
- Lattice gauge symmetry in neural networks [0.0]
We review a novel neural network architecture called lattice gauge equivariant convolutional neural networks (L-CNNs).
We discuss the concept of gauge equivariance which we use to explicitly construct a gauge equivariant convolutional layer and a bilinear layer.
The performance of L-CNNs and non-equivariant CNNs is compared using seemingly simple non-linear regression tasks.
arXiv Detail & Related papers (2021-11-08T11:20:11Z)
- Convolutional Filtering and Neural Networks with Non Commutative Algebras [153.20329791008095]
We study the generalization of non commutative convolutional neural networks.
We show that non commutative convolutional architectures can be stable to deformations on the space of operators.
arXiv Detail & Related papers (2021-08-23T04:22:58Z)
- Learning Invariances in Neural Networks [51.20867785006147]
We show how to parameterize a distribution over augmentations and optimize the training loss simultaneously with respect to the network parameters and augmentation parameters.
We can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations.
arXiv Detail & Related papers (2020-10-22T17:18:48Z)
- Kernel and Rich Regimes in Overparametrized Models [69.40899443842443]
We show that gradient descent on overparametrized multilayer networks can induce rich implicit biases that are not RKHS norms.
We also demonstrate this transition empirically for more complex matrix factorization models and multilayer non-linear networks.
arXiv Detail & Related papers (2020-02-20T15:43:02Z)