The autoregressive neural network architecture of the Boltzmann distribution of pairwise interacting spins systems
- URL: http://arxiv.org/abs/2302.08347v3
- Date: Mon, 25 Mar 2024 13:18:39 GMT
- Title: The autoregressive neural network architecture of the Boltzmann distribution of pairwise interacting spins systems
- Authors: Indaco Biazzo
- Abstract summary: Generative Autoregressive Neural Networks (ARNNs) have recently demonstrated exceptional results in image and language generation tasks.
This work presents an exact mapping of the Boltzmann distribution of binary pairwise interacting systems into autoregressive form.
The weights and biases of the resulting ARNN's first layer correspond to the Hamiltonian's couplings and external fields.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative Autoregressive Neural Networks (ARNNs) have recently demonstrated exceptional results in image and language generation tasks, contributing to the growing popularity of generative models in both scientific and commercial applications. This work presents an exact mapping of the Boltzmann distribution of binary pairwise interacting systems into autoregressive form. The weights and biases of the resulting ARNN architecture's first layer correspond to the Hamiltonian's couplings and external fields, and the architecture features widely used structures, such as residual connections and a recurrent layout, each with a clear physical meaning. Moreover, the explicit formulation of the architecture enables the use of statistical physics techniques to derive new ARNNs for specific systems. As examples, new effective ARNN architectures are derived from two well-known mean-field systems, the Curie-Weiss and Sherrington-Kirkpatrick models, showing superior performance in approximating the Boltzmann distributions of the corresponding physical models compared with other commonly used architectures. The connection established between the physics of the system and the neural network architecture provides a means to derive new architectures for different interacting systems and to interpret existing ones from a physical perspective.
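To make the mapping concrete, the following is a minimal sketch of an autoregressive parameterization for an N-spin binary system in which the first layer's weights and biases are set from the couplings J and external fields h of the Hamiltonian H(s) = -Σ_{i<j} J_ij s_i s_j - Σ_i h_i s_i. It uses logistic conditionals over a masked linear layer in PyTorch and is an illustrative assumption, not the exact architecture derived in the paper.

```python
import torch
import torch.nn as nn

class PairwiseARNN(nn.Module):
    """Illustrative autoregressive model for an N-spin binary system.

    Hypothetical sketch: the first-layer weights/biases are initialized from
    the couplings J and fields h, echoing the paper's observation that these
    quantities appear in the first layer.  NOT the paper's exact architecture.
    """

    def __init__(self, J, h):
        super().__init__()
        N = h.shape[0]
        # Strictly lower-triangular mask enforces the autoregressive order p(s_i | s_{<i}).
        self.register_buffer("mask", torch.tril(torch.ones(N, N), diagonal=-1))
        self.W = nn.Parameter(J.clone())  # initialized from the couplings
        self.b = nn.Parameter(h.clone())  # initialized from the external fields

    def conditionals(self, s):
        # s: (batch, N) spins in {-1, +1}; local field felt by spin i from s_{<i}.
        field = s @ (self.W * self.mask).T + self.b
        return torch.sigmoid(2.0 * field)  # p(s_i = +1 | s_{<i}) at beta = 1

    def log_prob(self, s):
        p_up = self.conditionals(s)
        p_s = torch.where(s > 0, p_up, 1.0 - p_up)
        return torch.log(p_s + 1e-12).sum(dim=-1)

# Usage: Sherrington-Kirkpatrick-like random couplings, zero fields (illustrative).
N = 10
J = torch.randn(N, N) / N ** 0.5
J = (J + J.T) / 2
model = PairwiseARNN(J, torch.zeros(N))
s = torch.randint(0, 2, (4, N)).float() * 2 - 1
print(model.log_prob(s))
```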
Related papers
- Systematic construction of continuous-time neural networks for linear dynamical systems [0.0]
We discuss a systematic approach to constructing neural architectures for modeling a subclass of dynamical systems.
We use a variant of continuous-time neural networks in which the output of each neuron evolves continuously as the solution of a first-order or second-order ordinary differential equation (ODE).
Instead of deriving the network architecture and parameters from data, we propose a gradient-free algorithm to compute a sparse architecture and the network parameters directly from the given linear time-invariant (LTI) system.
arXiv Detail & Related papers (2024-03-24T16:16:41Z)
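As an illustrative aside to the continuous-time neural network entry above: a single unit whose output x(t) obeys a first-order ODE τ dx/dt = -x + u(t) can be simulated with a simple Euler integrator. The function name, time constant, and integration scheme below are assumptions for illustration, not that paper's gradient-free construction.

```python
import numpy as np

def first_order_neuron(u, tau=0.1, dt=0.01):
    """Integrate tau * dx/dt = -x + u(t) with forward Euler.

    Hypothetical illustration of a neuron whose output evolves continuously
    as the solution of a first-order ODE; `tau` and the Euler step are
    assumptions, not the referenced paper's construction.
    """
    x = np.zeros(len(u))
    for k in range(1, len(u)):
        x[k] = x[k - 1] + dt / tau * (-x[k - 1] + u[k - 1])
    return x

# Usage: step input driving one continuous-time unit.
u = np.ones(200)
x = first_order_neuron(u)
print(x[-1])  # approaches the input value ~1.0
```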
- Multi-conditioned Graph Diffusion for Neural Architecture Search [8.290336491323796]
We present a graph diffusion-based NAS approach that uses discrete conditional graph diffusion processes to generate high-performing neural network architectures.
We show promising results on six standard benchmarks, yielding novel and unique architectures at high speed.
arXiv Detail & Related papers (2024-03-09T21:45:31Z)
- A predictive physics-aware hybrid reduced order model for reacting flows [65.73506571113623]
A new hybrid predictive Reduced Order Model (ROM) is proposed to solve reacting flow problems.
The number of degrees of freedom is reduced from thousands of temporal points to a few Proper Orthogonal Decomposition (POD) modes with their corresponding temporal coefficients.
Two different deep learning architectures have been tested to predict the temporal coefficients.
arXiv Detail & Related papers (2023-01-24T08:39:20Z)
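For context on the dimensionality reduction mentioned in the entry above: POD modes and temporal coefficients can be obtained from a snapshot matrix via a truncated SVD. A generic sketch follows; the snapshot layout and truncation rank are assumptions, not that paper's setup.

```python
import numpy as np

def pod_reduce(snapshots, r):
    """Truncated POD of a snapshot matrix (rows: spatial DOFs, cols: time steps).

    Returns r spatial modes and the temporal coefficients that a downstream
    model (e.g. a neural network) could be trained to predict.  Illustrative
    only; layout and rank are assumptions.
    """
    U, s, Vt = np.linalg.svd(snapshots, full_matrices=False)
    modes = U[:, :r]                    # spatial POD modes
    coeffs = s[:r, None] * Vt[:r, :]    # temporal coefficients, shape (r, n_t)
    return modes, coeffs

# Usage: 1000 spatial DOFs, 200 time steps, keep 5 modes.
X = np.random.rand(1000, 200)
modes, coeffs = pod_reduce(X, r=5)
reconstruction = modes @ coeffs         # low-rank approximation of X
print(modes.shape, coeffs.shape, np.linalg.norm(X - reconstruction) / np.linalg.norm(X))
```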
- Universal approximation property of invertible neural networks [76.95927093274392]
Invertible neural networks (INNs) are neural network architectures with invertibility by design.
Thanks to their invertibility and the tractability of their Jacobians, INNs have various machine learning applications such as probabilistic modeling, generative modeling, and representation learning.
arXiv Detail & Related papers (2022-04-15T10:45:26Z)
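As a generic illustration of the invertibility and tractable Jacobian mentioned above, an additive coupling layer (a NICE-style construction) has an analytic inverse and a log-determinant of zero. This is a standard textbook example, not the specific architectures studied in that paper.

```python
import numpy as np

def coupling_forward(x, t):
    """Additive coupling: split x and shift the second half by t(first half)."""
    x1, x2 = np.split(x, 2, axis=-1)
    y2 = x2 + t(x1)                  # invertible by construction
    return np.concatenate([x1, y2], axis=-1)

def coupling_inverse(y, t):
    """Exact inverse of coupling_forward for any function t."""
    y1, y2 = np.split(y, 2, axis=-1)
    x2 = y2 - t(y1)
    return np.concatenate([y1, x2], axis=-1)

# Usage: any t works; the Jacobian is unit-triangular, so log|det J| = 0.
t = lambda z: np.tanh(z @ np.full((2, 2), 0.5))
x = np.random.randn(4, 4)
y = coupling_forward(x, t)
print(np.allclose(coupling_inverse(y, t), x))  # True
```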
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- Self-Learning for Received Signal Strength Map Reconstruction with Neural Architecture Search [63.39818029362661]
We present a model based on Neural Architecture Search (NAS) and self-learning for received signal strength (RSS) map reconstruction.
The approach first finds an optimal NN architecture and simultaneously trains the deduced model on some ground-truth measurements of a given RSS map.
Experimental results show that the signal predictions of this second model outperform non-learning-based state-of-the-art techniques and NN models without architecture search.
arXiv Detail & Related papers (2021-05-17T12:19:22Z)
- Adversarially Robust Neural Architectures [43.74185132684662]
This paper aims to improve the adversarial robustness of the network from the architecture perspective using a NAS framework.
We explore the relationship among adversarial robustness, the Lipschitz constant, and the architecture parameters.
Our algorithm empirically achieves the best performance among all the models under various attacks on different datasets.
arXiv Detail & Related papers (2020-09-02T08:52:15Z)
- Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks [61.76338096980383]
A range of neural architecture search (NAS) techniques are used to automatically learn two types of hyper-parameters of state-of-the-art factored time delay neural networks (TDNNs).
These include the DARTS method integrating architecture selection with lattice-free MMI (LF-MMI) TDNN training.
Experiments conducted on a 300-hour Switchboard corpus suggest the auto-configured systems consistently outperform the baseline LF-MMI TDNN systems.
arXiv Detail & Related papers (2020-07-17T08:32:11Z)
- Neural Architecture Optimization with Graph VAE [21.126140965779534]
We propose an efficient NAS approach to optimize network architectures in a continuous space.
The framework jointly learns four components: the encoder, the performance predictor, the complexity predictor and the decoder.
arXiv Detail & Related papers (2020-06-18T07:05:48Z)
- A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.