Neural Entropy
- URL: http://arxiv.org/abs/2409.03817v1
- Date: Thu, 5 Sep 2024 18:00:00 GMT
- Title: Neural Entropy
- Authors: Akhil Premkumar
- Abstract summary: We examine the connection between deep learning and information theory through the paradigm of diffusion models.
We characterize the amount of information required to reverse a diffusive process and show that neural networks store this information and operate in a manner reminiscent of Maxwell's demon during the generative stage.
This conceptual picture blends elements of stochastic optimal control, thermodynamics, information theory, and optimal transport, and raises the prospect of applying diffusion models as a test bench to understand neural networks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We examine the connection between deep learning and information theory through the paradigm of diffusion models. Using well-established principles from non-equilibrium thermodynamics we can characterize the amount of information required to reverse a diffusive process. Neural networks store this information and operate in a manner reminiscent of Maxwell's demon during the generative stage. We illustrate this cycle using a novel diffusion scheme we call the entropy matching model, wherein the information conveyed to the network during training exactly corresponds to the entropy that must be negated during reversal. We demonstrate that this entropy can be used to analyze the encoding efficiency and storage capacity of the network. This conceptual picture blends elements of stochastic optimal control, thermodynamics, information theory, and optimal transport, and raises the prospect of applying diffusion models as a test bench to understand neural networks.
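As a toy illustration of the entropy bookkeeping described above (a minimal sketch of our own, not the paper's entropy matching model), consider a one-dimensional Ornstein-Uhlenbeck forward diffusion acting on Gaussian data: the entropy it injects can be computed in closed form, and it is exactly the entropy the reverse, generative process must negate.

```python
import numpy as np

# Minimal sketch (ours, not the paper's model): entropy gained by a 1-D
# Ornstein-Uhlenbeck forward diffusion dx = -x dt + sqrt(2) dW acting on
# Gaussian data N(0, sigma0^2), whose stationary (prior) variance is 1.

def gaussian_entropy(var):
    """Differential entropy of N(0, var) in nats."""
    return 0.5 * np.log(2 * np.pi * np.e * var)

sigma0_sq = 0.01                     # sharply peaked "data" distribution
ts = np.linspace(0.0, 5.0, 200)      # diffusion times
var_t = 1.0 + (sigma0_sq - 1.0) * np.exp(-2.0 * ts)   # OU variance law

H = gaussian_entropy(var_t)
print(f"entropy of data:   {H[0]:.4f} nats")
print(f"entropy of prior:  {H[-1]:.4f} nats")
print(f"entropy to negate: {H[-1] - H[0]:.4f} nats")
```

The gap between the prior's entropy and the data's entropy is, in this toy setting, the information the network must store to run the process backwards, in keeping with the Maxwell's demon picture.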
Related papers
- Neural Message Passing Induced by Energy-Constrained Diffusion [79.9193447649011]
We propose an energy-constrained diffusion model as a principled interpretable framework for understanding the mechanism of MPNNs.
We show that the new model can yield promising performance for cases where the data structures are observed (as a graph), partially observed or completely unobserved.
arXiv Detail & Related papers (2024-09-13T17:54:41Z)
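To see why message passing can be read as a discretized diffusion, here is a minimal sketch of our own (plain heat diffusion on a graph, without the paper's energy constraint): one explicit Euler step of the graph heat equation is exactly a message-passing update.

```python
import numpy as np

# Illustrative sketch (ours, not the paper's framework): one step of
# discretized heat diffusion on a graph doubles as a message-passing
# update, x <- x + tau * (D^{-1} A x - x).

def diffusion_mp_step(A, X, tau=0.5):
    """One explicit-Euler diffusion step over adjacency matrix A."""
    deg = A.sum(axis=1, keepdims=True)
    P = A / np.maximum(deg, 1e-12)    # row-normalized transition matrix
    return X + tau * (P @ X - X)      # neighbors pull features together

# Tiny 4-node path graph with 2-d node features.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
for _ in range(3):
    X = diffusion_mp_step(A, X)
print(X)  # features smooth out along the graph, as diffusion predicts
```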
- Speed-accuracy relations for the diffusion models: Wisdom from nonequilibrium thermodynamics and optimal transport [0.0]
We discuss a connection between a generative model, called the diffusion model, and nonequilibrium thermodynamics.
We numerically illustrate the validity of the speed-accuracy relations for diffusion models with different noise schedules and different datasets.
We also show that non-conservative forces lead to inaccurate data generation, and demonstrate that our results apply to generation from real-world image datasets.
arXiv Detail & Related papers (2024-07-05T13:35:14Z)
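The "speed" side of such trade-offs can be made concrete with a hedged toy computation (ours, not the paper's derivation): under a variance-preserving forward process, Gaussian data N(0, sigma0^2) relaxes toward the N(0, 1) prior at a rate set by the noise schedule beta(t), and the remaining KL divergence is available in closed form.

```python
import numpy as np

# Hedged sketch (ours): how fast Gaussian data N(0, sigma0^2) relaxes to
# the N(0, 1) prior under two noise schedules beta(t).  The schedules
# are our choices; this only illustrates the "speed" side of the story.

def kl_to_prior(var):
    """KL( N(0, var) || N(0, 1) ) in nats."""
    return 0.5 * (var - 1.0 - np.log(var))

sigma0_sq = 0.01
ts = np.linspace(1e-3, 1.0, 5)        # checkpoints along the diffusion

# beta_int(t) is the integral of beta(s) from 0 to t.
for name, beta_int in [("constant beta=4 ", lambda t: 4.0 * t),
                       ("linear 0.1 -> 8 ", lambda t: 0.1 * t + 0.5 * 7.9 * t**2)]:
    var = 1.0 + (sigma0_sq - 1.0) * np.exp(-beta_int(ts))
    print(name, np.round(kl_to_prior(var), 4))
```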
- Predicting Cascading Failures with a Hyperparametric Diffusion Model [66.89499978864741]
We study cascading failures in power grids through the lens of diffusion models.
Our model integrates viral diffusion principles with physics-based concepts.
We show that this diffusion model can be learned from traces of cascading failures.
arXiv Detail & Related papers (2024-06-12T02:34:24Z)
- Assessing Neural Network Representations During Training Using Noise-Resilient Diffusion Spectral Entropy [55.014926694758195]
Entropy and mutual information in neural networks provide rich information on the learning process.
We leverage data geometry to access the underlying manifold and reliably compute these information-theoretic measures.
We show that they form noise-resistant measures of intrinsic dimensionality and relationship strength in high-dimensional simulated data.
arXiv Detail & Related papers (2023-12-04T01:32:42Z)
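A simplified reading of the diffusion-spectral-entropy idea (our sketch, not the authors' exact estimator): build a Gaussian affinity over the data, normalize it into a diffusion operator, and take the Shannon entropy of its powered eigenvalue spectrum.

```python
import numpy as np

# Hedged sketch of a diffusion-spectral-entropy-style measure (our
# simplified reading, not the authors' exact estimator): Gaussian
# affinities -> row-normalized diffusion operator -> Shannon entropy
# of its powered eigenvalue spectrum.

def diffusion_spectral_entropy(X, t=1):
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. dists
    K = np.exp(-D2 / np.median(D2))                      # median-bandwidth kernel
    P = K / K.sum(axis=1, keepdims=True)                 # diffusion operator
    lam = np.abs(np.linalg.eigvals(P)) ** t              # powered spectrum
    p = lam / lam.sum()
    p = p[p > 1e-12]
    return -(p * np.log(p)).sum()                        # entropy in nats

rng = np.random.default_rng(0)
low_dim = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 10))  # rank-2 cloud
full_dim = rng.normal(size=(200, 10))                           # full-rank cloud
print(diffusion_spectral_entropy(low_dim))   # typically lower ...
print(diffusion_spectral_entropy(full_dim))  # ... than the full-rank cloud
```

The intuition is that diffusion on a low-dimensional manifold has a faster-decaying spectrum, so the entropy tracks intrinsic dimensionality.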
- Information-Theoretic GAN Compression with Variational Energy-based Model [36.77535324130402]
We propose an information-theoretic knowledge distillation approach for the compression of generative adversarial networks.
We show that the proposed algorithm consistently achieves outstanding performance in compressing generative adversarial networks.
arXiv Detail & Related papers (2023-03-28T15:32:21Z)
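The variational flavor of such information-theoretic distillation can be sketched as follows (a simplification of ours, using a Gaussian decoder rather than the paper's energy-based model): the student is pushed to retain information about the teacher by maximizing a variational lower bound E[log q(t|s)] on their mutual information.

```python
import numpy as np

# Hedged sketch (ours, not the paper's method): lower-bound the
# teacher/student mutual information with a Gaussian decoder
# q(t|s) = N(W s, sigma^2 I) fitted here by least squares.

rng = np.random.default_rng(1)
S = rng.normal(size=(1000, 8))                 # toy student features
T = S @ rng.normal(size=(8, 16)) + 0.1 * rng.normal(size=(1000, 16))  # teacher

W, *_ = np.linalg.lstsq(S, T, rcond=None)      # decoder mean: t ~ W s
resid = T - S @ W
sigma2 = resid.var()
# Average log-likelihood E[log q(t|s)]: higher means the student's
# features carry more information about the teacher's, so its negation
# serves as the distillation penalty to minimize.
log_q = -0.5 * (resid**2 / sigma2 + np.log(2 * np.pi * sigma2)).sum(axis=1).mean()
print(f"variational bound term E[log q(t|s)] = {log_q:.2f}")
```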
- Towards quantifying information flows: relative entropy in deep neural networks and the renormalization group [0.0]
We quantify the flow of information by explicitly computing the relative entropy or Kullback-Leibler divergence.
For neural networks, this behavior may have implications for various information-theoretic methods in machine learning.
arXiv Detail & Related papers (2021-07-14T18:00:01Z)
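For a concrete feel of the quantity involved, here is a minimal sketch (ours) that estimates the relative entropy D_KL(p || q) between two sample sets via shared-bin histograms, e.g. a layer's activations at two stages of training.

```python
import numpy as np

# Minimal sketch (ours): histogram estimate of D_KL(p || q) between two
# sample sets sharing the same bins, e.g. a layer's activations early
# versus late in training.

def kl_from_samples(x, y, bins=50):
    lo, hi = min(x.min(), y.min()), max(x.max(), y.max())
    p, edges = np.histogram(x, bins=bins, range=(lo, hi), density=True)
    q, _ = np.histogram(y, bins=bins, range=(lo, hi), density=True)
    w = edges[1] - edges[0]            # bin width turns densities into masses
    mask = (p > 0) & (q > 0)           # avoid log(0); crude but serviceable
    return (w * p[mask] * np.log(p[mask] / q[mask])).sum()

rng = np.random.default_rng(0)
early = rng.normal(0.0, 1.0, 10_000)   # stand-in for early-training activations
late = rng.normal(0.5, 0.8, 10_000)    # stand-in for late-training activations
print(f"D_KL(early || late) ~ {kl_from_samples(early, late):.3f} nats")
```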
- Physics perception in sloshing scenes with guaranteed thermodynamic consistency [0.0]
We propose a strategy to learn the full state of sloshing liquids from measurements of the free surface.
Our approach is based on recurrent neural networks (RNNs) that project the limited available information onto a reduced-order manifold.
arXiv Detail & Related papers (2021-06-24T20:13:56Z)
- Influence Estimation and Maximization via Neural Mean-Field Dynamics [60.91291234832546]
We propose a novel learning framework using neural mean-field (NMF) dynamics for inference and estimation problems.
Our framework can simultaneously learn the structure of the diffusion network and the evolution of node infection probabilities.
arXiv Detail & Related papers (2021-06-03T00:02:05Z)
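The classical mean-field dynamics such frameworks build on can be sketched directly (our toy version, not the learned neural mean-field model): node infection probabilities evolve as dp/dt = beta (1 - p) (A p) - delta p.

```python
import numpy as np

# Hedged sketch (ours, not the paper's learned model): the classical
# mean-field SIS approximation it builds on, evolving node infection
# probabilities p under dp/dt = beta*(1-p)*(A p) - delta*p.

def mean_field_infection(A, p0, beta=0.3, delta=0.1, dt=0.1, steps=100):
    p = p0.copy()
    for _ in range(steps):
        p += dt * (beta * (1.0 - p) * (A @ p) - delta * p)
        p = np.clip(p, 0.0, 1.0)       # keep probabilities valid
    return p

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
p0 = np.array([1.0, 0.0, 0.0, 0.0])    # seed infection at node 0
print(mean_field_infection(A, p0))      # long-run infection probabilities
```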
- Deep learning of thermodynamics-aware reduced-order models from data [0.08699280339422537]
We present an algorithm to learn the relevant latent variables of a large-scale discretized physical system.
We then predict its time evolution using thermodynamically consistent deep neural networks.
arXiv Detail & Related papers (2020-07-03T08:49:01Z)
- Network Diffusions via Neural Mean-Field Dynamics [52.091487866968286]
We propose a novel learning framework for inference and estimation problems of diffusion on networks.
Our framework is derived from the Mori-Zwanzig formalism to obtain an exact evolution of the node infection probabilities.
Our approach is versatile and robust to variations of the underlying diffusion network models.
arXiv Detail & Related papers (2020-06-16T18:45:20Z)
- Focus of Attention Improves Information Transfer in Visual Features [80.22965663534556]
This paper focuses on unsupervised learning for transferring visual information in a truly online setting.
The entropy terms are computed by a temporal process that yields their online estimation.
In order to better structure the input probability distribution, we use a human-like focus of attention model.
arXiv Detail & Related papers (2020-06-16T15:07:25Z)
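An online entropy estimate of the kind gestured at above can be sketched very simply (ours, far cruder than the paper's temporal process): maintain a histogram with exponential forgetting and read off its entropy after every sample.

```python
import numpy as np

# Hedged sketch (ours): an online entropy estimate over a feature
# stream, using a histogram with exponential forgetting so the estimate
# tracks the input distribution as it drifts.

class OnlineEntropy:
    def __init__(self, bins=32, lo=-4.0, hi=4.0, decay=0.99):
        self.edges = np.linspace(lo, hi, bins + 1)
        self.counts = np.full(bins, 1e-6)      # tiny prior avoids log(0)
        self.decay = decay

    def update(self, value):
        self.counts *= self.decay              # forget old observations
        i = np.clip(np.searchsorted(self.edges, value) - 1,
                    0, len(self.counts) - 1)
        self.counts[i] += 1.0
        p = self.counts / self.counts.sum()
        return -(p * np.log(p)).sum()          # running entropy in nats

rng = np.random.default_rng(0)
est = OnlineEntropy()
for v in rng.normal(size=2000):                # simulated feature stream
    h = est.update(v)
print(f"online entropy estimate: {h:.3f} nats")
```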
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.