Interpreting convolutional neural networks' low dimensional
approximation to quantum spin systems
- URL: http://arxiv.org/abs/2210.00692v1
- Date: Mon, 3 Oct 2022 02:49:16 GMT
- Title: Interpreting convolutional neural networks' low dimensional
approximation to quantum spin systems
- Authors: Yilong Ju, Shah Saad Alam, Jonathan Minoff, Fabio Anselmi, Han Pu,
Ankit Patel
- Abstract summary: Convolutional neural networks (CNNs) have been employed along with Variational Monte Carlo methods for finding the ground state of quantum many-body spin systems.
We provide a theoretical and experimental analysis of how the CNN optimizes learning for spin systems, and investigate the CNN's low dimensional approximation.
Our results allow us to gain a comprehensive, improved understanding of how CNNs successfully approximate quantum spin Hamiltonians.
- Score: 1.631115063641726
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional neural networks (CNNs) have been employed along with
Variational Monte Carlo methods for finding the ground state of quantum
many-body spin systems with great success. In order to do so, however, a CNN
with only linearly many variational parameters has to circumvent the "curse of
dimensionality" and successfully approximate a wavefunction on an
exponentially large Hilbert space. In our work, we provide a theoretical and
experimental analysis of how the CNN optimizes learning for spin systems, and
investigate the CNN's low dimensional approximation. We first quantify the role
played by physical symmetries of the underlying spin system during training. We
incorporate our insights into a new training algorithm and demonstrate its
improved efficiency, accuracy and robustness. We then further investigate the
CNN's ability to approximate wavefunctions by looking at the entanglement
spectrum captured by the size of the convolutional filter. Our insights reveal
the CNN to be an ansatz fundamentally centered around the occurrence statistics
of $K$-motifs of the input strings. We use this motivation to provide the
shallow CNN ansatz with a unifying theoretical interpretation in terms of other
well-known statistical and physical ansatzes such as the maximum entropy
(MaxEnt) and entangled plaquette correlator product states (EP-CPS). Using
regression analysis, we find further relationships between the CNN's
approximations of the different motifs' expectation values. Our results allow
us to gain a comprehensive, improved understanding of how CNNs successfully
approximate quantum spin Hamiltonians and to use that understanding to improve
CNN performance.
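To make the motif picture concrete, the sketch below is a minimal illustration (not the authors' implementation; the chain length, filter size, channel count, and the ln cosh nonlinearity are assumptions chosen here for concreteness) of how a translation-invariant shallow CNN assigns a log-amplitude to a spin configuration, and why that amplitude depends only on the occurrence statistics of the $K$-site windows ($K$-motifs) of the input string.

```python
import numpy as np

rng = np.random.default_rng(0)

L = 12   # chain length (number of spins); illustrative choice
K = 3    # filter size: width of the "K-motif" window; illustrative
C = 4    # number of convolutional channels; illustrative

# Variational parameters of a one-layer, translation-invariant CNN ansatz.
W = 0.1 * rng.standard_normal((C, K))  # K-site filters, one per channel
b = np.zeros(C)                        # channel biases
v = 0.1 * rng.standard_normal(C)       # weights of the final sum over channels

def log_psi(spins):
    """Log-amplitude log psi(s) for a configuration s in {-1, +1}^L.

    Convolve each filter over the chain with periodic boundaries, apply a
    ln cosh nonlinearity, sum over all sites, and combine channels linearly.
    """
    s = np.asarray(spins, dtype=float)
    total = 0.0
    for c in range(C):
        # activation of filter c on the K-site window starting at each site i
        pre = np.array([W[c] @ np.roll(s, -i)[:K] for i in range(L)]) + b[c]
        total += v[c] * np.sum(np.log(np.cosh(pre)))
    return total

def motif_counts(spins):
    """Occurrence counts of each K-site pattern (K-motif), periodic boundaries."""
    s = np.asarray(spins)
    counts = {}
    for i in range(L):
        motif = tuple(int(x) for x in np.roll(s, -i)[:K])
        counts[motif] = counts.get(motif, 0) + 1
    return counts

# The log-amplitude is a function of the motif statistics alone: summing the
# nonlinear filter responses over all sites only "sees" how often each K-motif
# occurs, not where it occurs.
config = rng.choice([-1, 1], size=L)
print("log psi(s)    :", log_psi(config))
print("3-motif counts:", motif_counts(config))
```

Because the channel activations are summed over every lattice site, two configurations whose K-site windows occur with identical multiplicities receive the same log-amplitude; this is the sense in which such a shallow CNN can be read as an ansatz built on motif occurrence statistics.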
Related papers
- On the rates of convergence for learning with convolutional neural networks [9.772773527230134]
We study approximation and learning capacities of convolutional neural networks (CNNs) with one-side zero-padding and multiple channels.
We derive convergence rates for estimators based on CNNs in many learning problems.
It is also shown that the obtained rates for classification are minimax optimal in some common settings.
arXiv Detail & Related papers (2024-03-25T06:42:02Z)
- Forecasting Fold Bifurcations through Physics-Informed Convolutional Neural Networks [0.0]
This study proposes a physics-informed convolutional neural network (CNN) for identifying dynamical systems' time series near a fold bifurcation.
The CNN is trained with a relatively small amount of data and on a single, very simple system.
A similar task requires significant extrapolation capabilities, which are obtained by exploiting physics-based information.
arXiv Detail & Related papers (2023-12-21T10:07:52Z)
- Continuous approximation by convolutional neural networks with a sigmoidal function [0.0]
We present a class of convolutional neural networks (CNNs) called non-overlapping CNNs.
We prove that such networks with a sigmoidal activation function can approximate arbitrary continuous functions defined on compact input sets to any desired degree of accuracy.
arXiv Detail & Related papers (2022-09-27T12:31:36Z)
- What Can Be Learnt With Wide Convolutional Neural Networks? [69.55323565255631]
We study infinitely-wide deep CNNs in the kernel regime.
We prove that deep CNNs adapt to the spatial scale of the target function.
We conclude by computing the generalisation error of a deep CNN trained on the output of another deep CNN.
arXiv Detail & Related papers (2022-08-01T17:19:32Z)
- Separation of scales and a thermodynamic description of feature learning in some CNNs [2.28438857884398]
Deep neural networks (DNNs) are powerful tools for compressing and distilling information.
A common strategy in such cases is to identify slow degrees of freedom that average out the erratic behavior of the underlying fast microscopic variables.
Here, we identify such a separation of scales occurring in over-parameterized deep convolutional neural networks (CNNs) at the end of training.
The resulting thermodynamic theory of deep learning yields accurate predictions on several deep non-linear CNN toy models.
arXiv Detail & Related papers (2021-12-31T10:49:55Z)
- Classification of diffraction patterns using a convolutional neural network in single particle imaging experiments performed at X-ray free-electron lasers [53.65540150901678]
Single particle imaging (SPI) at X-ray free electron lasers (XFELs) is particularly well suited to determine the 3D structure of particles in their native environment.
For a successful reconstruction, diffraction patterns originating from a single hit must be isolated from a large number of acquired patterns.
We propose to formulate this task as an image classification problem and solve it using convolutional neural network (CNN) architectures.
arXiv Detail & Related papers (2021-12-16T17:03:14Z)
- How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs now maintain performance with dramatic reduction in parameters and computations.
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
- An Information-theoretic Visual Analysis Framework for Convolutional Neural Networks [11.15523311079383]
We introduce a data model to organize the data that can be extracted from CNN models.
We then propose two ways to calculate entropy under different circumstances.
We develop a visual analysis system, CNNSlicer, to interactively explore the amount of information changes inside the model.
arXiv Detail & Related papers (2020-05-02T21:36:50Z)
- What Deep CNNs Benefit from Global Covariance Pooling: An Optimization Perspective [102.37204254403038]
We attempt to understand what deep CNNs gain from GCP from an optimization viewpoint.
We show that GCP can make the optimization landscape more smooth and the gradients more predictive.
We conduct extensive experiments using various deep CNN models on diversified tasks, and the results provide strong support to our findings.
arXiv Detail & Related papers (2020-03-25T07:00:45Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.