Transferable Neural Wavefunctions for Solids
- URL: http://arxiv.org/abs/2405.07599v1
- Date: Mon, 13 May 2024 09:59:59 GMT
- Title: Transferable Neural Wavefunctions for Solids
- Authors: Leon Gerard, Michael Scherbela, Halvard Sutterud, Matthew Foulkes, Philipp Grohs,
- Abstract summary: We show how to optimize a single ansatz across all of these variations.
We successfully transfer a network, pre-trained on 2x2x2 supercells of LiH, to 3x3x3 supercells.
- Score: 5.219568203653524
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep-Learning-based Variational Monte Carlo (DL-VMC) has recently emerged as a highly accurate approach for finding approximate solutions to the many-electron Schr\"odinger equation. Despite its favorable scaling with the number of electrons, $\mathcal{O}(n_\text{el}^{4})$, the practical value of DL-VMC is limited by the high cost of optimizing the neural network weights for every system studied. To mitigate this problem, recent research has proposed optimizing a single neural network across multiple systems, reducing the cost per system. Here we extend this approach to solids, where similar but distinct calculations using different geometries, boundary conditions, and supercell sizes are often required. We show how to optimize a single ansatz across all of these variations, reducing the required number of optimization steps by an order of magnitude. Furthermore, we exploit the transfer capabilities of a pre-trained network. We successfully transfer a network, pre-trained on 2x2x2 supercells of LiH, to 3x3x3 supercells. This reduces the number of optimization steps required to simulate the large system by a factor of 50 compared to previous work.
Related papers
- Fast and scalable Wasserstein-1 neural optimal transport solver for single-cell perturbation prediction [55.89763969583124]
Optimal transport theory provides a principled framework for constructing such mappings.
We propose a novel optimal transport solver based on Wasserstein-1.
Our experiments demonstrate that the proposed solver can mimic the $W$ OT solvers in finding a unique and monotonic" map on 2D datasets.
arXiv Detail & Related papers (2024-11-01T14:23:19Z) - Two Sparse Matrices are Better than One: Sparsifying Neural Networks with Double Sparse Factorization [0.0]
We present Double Sparse Factorization (DSF), where we factorize each weight matrix into two sparse matrices.
Our method achieves state-of-the-art results, enabling unprecedented sparsification of neural networks.
arXiv Detail & Related papers (2024-09-27T15:48:39Z) - Geometry-Informed Neural Operator for Large-Scale 3D PDEs [76.06115572844882]
We propose the geometry-informed neural operator (GINO) to learn the solution operator of large-scale partial differential equations.
We successfully trained GINO to predict the pressure on car surfaces using only five hundred data points.
arXiv Detail & Related papers (2023-09-01T16:59:21Z) - Speed Limits for Deep Learning [67.69149326107103]
Recent advancement in thermodynamics allows bounding the speed at which one can go from the initial weight distribution to the final distribution of the fully trained network.
We provide analytical expressions for these speed limits for linear and linearizable neural networks.
Remarkably, given some plausible scaling assumptions on the NTK spectra and spectral decomposition of the labels -- learning is optimal in a scaling sense.
arXiv Detail & Related papers (2023-07-27T06:59:46Z) - A Deep Learning algorithm to accelerate Algebraic Multigrid methods in
Finite Element solvers of 3D elliptic PDEs [0.0]
We introduce a novel Deep Learning algorithm that minimizes the computational cost of the Algebraic multigrid method when used as a finite element solver.
We experimentally prove that the pooling successfully reduces the computational cost of processing a large sparse matrix and preserves the features needed for the regression task at hand.
arXiv Detail & Related papers (2023-04-21T09:18:56Z) - Low-complexity Approximate Convolutional Neural Networks [1.7368964547487395]
We present an approach for minimizing the computational complexity of trained Convolutional Neural Networks (ConvNet)
The idea is to approximate all elements of a given ConvNet with efficient approximations capable of extreme reductions in computational complexity.
Such low-complexity structures pave the way for low-power, efficient hardware designs.
arXiv Detail & Related papers (2022-07-29T21:59:29Z) - Combinatorial optimization for low bit-width neural networks [23.466606660363016]
Low-bit width neural networks have been extensively explored for deployment on edge devices to reduce computational resources.
Existing approaches have focused on gradient-based optimization in a two-stage train-and-compress setting.
We show that a combination of greedy coordinate descent and this novel approach can attain competitive accuracy on binary classification tasks.
arXiv Detail & Related papers (2022-06-04T15:02:36Z) - Hierarchical autoregressive neural networks for statistical systems [0.05156484100374058]
We propose a hierarchical association of physical degrees of freedom, for instance spins, to neurons which replaces it with the scaling with the linear extent $L$ of the system.
We demonstrate our approach on the two-dimensional Ising model by simulating lattices of various sizes up to $128 times 128$ spins, with time benchmarks reaching lattices of size $512 times 512$.
arXiv Detail & Related papers (2022-03-21T13:55:53Z) - Joint inference and input optimization in equilibrium networks [68.63726855991052]
deep equilibrium model is a class of models that foregoes traditional network depth and instead computes the output of a network by finding the fixed point of a single nonlinear layer.
We show that there is a natural synergy between these two settings.
We demonstrate this strategy on various tasks such as training generative models while optimizing over latent codes, training models for inverse problems like denoising and inpainting, adversarial training and gradient based meta-learning.
arXiv Detail & Related papers (2021-11-25T19:59:33Z) - DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and
Transformers [105.74546828182834]
We show a hardware-efficient dynamic inference regime, named dynamic weight slicing, which adaptively slice a part of network parameters for inputs with diverse difficulty levels.
We present dynamic slimmable network (DS-Net) and dynamic slice-able network (DS-Net++) by input-dependently adjusting filter numbers of CNNs and multiple dimensions in both CNNs and transformers.
arXiv Detail & Related papers (2021-09-21T09:57:21Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study a distributed variable for large-scale AUC for a neural network as with a deep neural network.
Our model requires a much less number of communication rounds and still a number of communication rounds in theory.
Our experiments on several datasets show the effectiveness of our theory and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.