Thermodynamic Bound on Energy and Negentropy Costs of Inference in Deep Neural Networks
- URL: http://arxiv.org/abs/2503.09980v1
- Date: Thu, 13 Mar 2025 02:35:07 GMT
- Title: Thermodynamic Bound on Energy and Negentropy Costs of Inference in Deep Neural Networks
- Authors: Alexei V. Tkachenko
- Abstract summary: The fundamental thermodynamic bound is derived for the energy cost of inference in Deep Neural Networks (DNNs). We show that the linear operations in DNNs can, in principle, be performed reversibly, whereas the non-linear activation functions impose an unavoidable energy cost.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The fundamental thermodynamic bound is derived for the energy cost of inference in Deep Neural Networks (DNNs). By applying Landauer's principle, we demonstrate that the linear operations in DNNs can, in principle, be performed reversibly, whereas the non-linear activation functions impose an unavoidable energy cost. The resulting theoretical lower bound on the inference energy is determined by the average number of neurons undergoing state transition for each inference. We also restate the thermodynamic bound in terms of negentropy, a metric that is more universal than energy for assessing the thermodynamic cost of information processing. The concept of negentropy is further elaborated in the context of information processing in biological and engineered systems, as well as in human intelligence. Our analysis provides insight into the physical limits of DNN efficiency and suggests potential directions for developing energy-efficient AI architectures that leverage reversible analog computing.
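To get a feel for the scale of this bound: Landauer's principle prices each irreversible bit erasure at k_B T ln 2 of dissipated energy. Below is a minimal back-of-envelope sketch, assuming (this is our reading of the abstract, not a formula quoted from the paper) that each neuron state transition costs at least one bit's worth of erasure:

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def landauer_bound(n_transitions: float, temperature: float = 300.0) -> float:
    """Lower bound on dissipated energy (joules) if each neuron state
    transition erases at least one bit: Landauer's k_B * T * ln 2 per bit."""
    return n_transitions * K_B * temperature * math.log(2)

# Example: 1e9 neuron state transitions per inference at room temperature
print(f"{landauer_bound(1e9):.2e} J")  # ~2.87e-12 J
```

At 300 K this comes to roughly 3 pJ per billion transitions, many orders of magnitude below the energy per inference of current digital hardware, which is the headroom the abstract attributes to reversible analog computing.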
Related papers
- Fractional Spike Differential Equations Neural Network with Efficient Adjoint Parameters Training [63.3991315762955]
Spiking Neural Networks (SNNs) draw inspiration from biological neurons to create realistic models for brain-like computation. Most existing SNNs assume a single time constant for neuronal membrane voltage dynamics, modeled by first-order ordinary differential equations (ODEs) with Markovian characteristics. We propose the Fractional SPIKE Differential Equation neural network (fspikeDE), which captures long-term dependencies in membrane voltage and spike trains through fractional-order dynamics.
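The abstract does not spell out fspikeDE's equations, so the sketch below is only a generic fractional-order leaky integrate-and-fire neuron, discretized with Grünwald-Letnikov weights, to illustrate the key point: for order alpha < 1, the voltage update depends on the entire voltage history, not just the previous step (all names and constants here are illustrative):

```python
import numpy as np

def fractional_lif(I, alpha=0.8, dt=1e-3, tau=0.02, v_th=1.0):
    """Toy fractional-order LIF neuron: D^alpha V = (-V + I) / tau,
    discretized with Gruenwald-Letnikov weights (non-Markovian memory)."""
    n = len(I)
    V = np.zeros(n)
    spikes = np.zeros(n, dtype=bool)
    w = np.ones(n)  # w_j = (-1)^j * binom(alpha, j), via the standard recurrence
    for j in range(1, n):
        w[j] = w[j - 1] * (1.0 - (alpha + 1.0) / j)
    h_a = dt ** alpha
    for k in range(1, n):
        memory = np.dot(w[1:k + 1], V[k - 1::-1])  # weighted sum over all past V
        V[k] = h_a * (-V[k - 1] + I[k]) / tau - memory
        if V[k] >= v_th:  # threshold crossing: emit a spike, reset the voltage
            spikes[k] = True
            V[k] = 0.0
    return V, spikes

V, s = fractional_lif(np.full(500, 1.2))  # constant drive; spiking emerges over time
```

Setting alpha = 1 recovers the usual first-order (Markovian) LIF update, which is exactly the limitation the paper targets.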
arXiv Detail & Related papers (2025-07-22T18:20:56Z) - Architecture of Information [0.0]
The paper explores an approach to constructing energy landscapes of a formal neuron and multilayer artificial neural networks (ANNs).
The study of informational and thermodynamic entropy in formal neuron and ANN models leads to the conclusion that informational entropy is energetic in nature.
The presented research makes it possible to formulate a formal definition of information in terms of the interaction processes between the internal and external energy of the system.
arXiv Detail & Related papers (2025-03-21T14:48:41Z) - Sustainable AI: Mathematical Foundations of Spiking Neural Networks [46.76155269576732]
Spiking neural networks, inspired by biological neurons, offer a promising alternative with potential computational and energy-efficiency gains. This article examines the computational properties of spiking networks through the lens of learning theory.
arXiv Detail & Related papers (2025-03-03T19:44:12Z) - Thermodynamic computing out of equilibrium [0.0]
We present the design for a thermodynamic computer that can perform arbitrary nonlinear calculations in or out of equilibrium. Simple thermodynamic circuits, fluctuating degrees of freedom in contact with a thermal bath, display an activity that is a nonlinear function of their input. We simulate a digital model of a thermodynamic neural network, and show that its parameters can be adjusted by a genetic algorithm to perform nonlinear calculations at specified observation times.
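The "fluctuating degrees of freedom in contact with a thermal bath" mentioned here are conventionally described by overdamped Langevin dynamics; a generic form (the paper's specific circuits may differ) is

$$\gamma\,\dot{x} = -\frac{\partial U(x;s)}{\partial x} + \sqrt{2\gamma k_B T}\,\xi(t), \qquad \langle \xi(t)\,\xi(t') \rangle = \delta(t-t'),$$

where the input s shapes the potential U, the thermal noise xi drives the fluctuations, and a time-averaged observable of x then serves as the circuit's nonlinear activity.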
arXiv Detail & Related papers (2024-12-22T22:51:51Z) - Physics-Informed Regularization for Domain-Agnostic Dynamical System Modeling [41.82469276824927]
We present a framework that achieves high-precision modeling for a wide range of dynamical systems.
It helps preserve energies for conservative systems while serving as a strong inductive bias for non-conservative, reversible systems.
By integrating the TRS loss within neural ordinary differential equation models, the proposed model TREAT demonstrates superior performance on diverse physical systems.
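A schematic version of such a time-reversal-symmetry (TRS) penalty, under the usual classical-mechanics convention (reverse time, negate momenta), is sketched below; the actual TREAT loss is defined in the paper, and this toy uses a plain Euler rollout of a hand-written vector field rather than a learned GraphODE:

```python
import numpy as np

def rollout(f, x0, dt, n):
    """Explicit-Euler rollout of dx/dt = f(x), returning all n+1 states."""
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(n):
        xs.append(xs[-1] + dt * f(xs[-1]))
    return np.stack(xs)

def trs_loss(f, x0, dt=0.01, n=200):
    """Penalty that is small iff the trajectory retraces itself when time
    is reversed: flip the momentum at the endpoint, roll out again, and
    compare against the reversed forward trajectory."""
    fwd = rollout(f, x0, dt, n)
    x_rev = fwd[-1].copy()
    x_rev[1] *= -1.0               # state = (q, p); time reversal negates p
    bwd = rollout(f, x_rev, dt, n)
    bwd[:, 1] *= -1.0              # map back to the forward-time convention
    return np.mean((bwd - fwd[::-1]) ** 2)

# Harmonic oscillator dq/dt = p, dp/dt = -q: a time-reversal-symmetric system,
# so the loss is near zero (small residual from Euler discretization error).
print(trs_loss(lambda x: np.array([x[1], -x[0]]), x0=[1.0, 0.0]))
```

Used as a regularizer on a learned dynamics model, this term rewards vector fields that respect the reversibility of the underlying mechanics.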
arXiv Detail & Related papers (2024-10-08T21:04:01Z) - DimOL: Dimensional Awareness as A New 'Dimension' in Operator Learning [63.5925701087252]
We introduce DimOL (Dimension-aware Operator Learning), drawing insights from dimensional analysis. To implement DimOL, we propose the ProdLayer, which can be seamlessly integrated into FNO-based and Transformer-based PDE solvers. Empirically, DimOL models achieve up to 48% performance gain within the PDE datasets.
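The abstract does not define the ProdLayer, so the sketch below is one plausible reading, motivated by the fact that dimensional analysis combines physical quantities multiplicatively: augment the usual linear path with an elementwise product of two learned projections. Treat every detail here as hypothetical:

```python
import torch
import torch.nn as nn

class ProdLayer(nn.Module):
    """Hypothetical product layer: a linear path plus an elementwise product
    of two learned projections, mimicking multiplicative combinations of
    physical quantities (e.g. density * velocity)."""
    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.lin = nn.Linear(d_in, d_out)
        self.a = nn.Linear(d_in, d_out)
        self.b = nn.Linear(d_in, d_out)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.lin(x) + self.a(x) * self.b(x)

y = ProdLayer(16, 32)(torch.randn(8, 16))  # -> shape (8, 32)
```

Because the block keeps the same input/output shape contract as a linear layer, it drops into FNO- or Transformer-style PDE solvers without touching the surrounding architecture, which would match the "seamlessly integrated" claim.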
arXiv Detail & Related papers (2024-10-08T10:48:50Z) - Contrastive Learning in Memristor-based Neuromorphic Systems [55.11642177631929]
Spiking neural networks have become an important family of neuron-based models that sidestep many of the key limitations facing modern-day backpropagation-trained deep networks.
In this work, we design and investigate a proof-of-concept instantiation of contrastive-signal-dependent plasticity (CSDP), a neuromorphic form of forward-forward-based, backpropagation-free learning.
arXiv Detail & Related papers (2024-09-17T04:48:45Z) - Neural Message Passing Induced by Energy-Constrained Diffusion [79.9193447649011]
We propose an energy-constrained diffusion model as a principled interpretable framework for understanding the mechanism of MPNNs.
We show that the new model can yield promising performance for cases where the data structures are observed (as a graph), partially observed or completely unobserved.
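In the simplest quadratic case (the paper treats more general energies), the connection is that one message-passing step is one gradient-descent step on a Dirichlet-type energy:

$$E(Z) = \sum_i \|z_i - x_i\|^2 + \lambda \sum_{(i,j) \in \mathcal{E}} w_{ij}\,\|z_i - z_j\|^2, \qquad z_i^{(k+1)} = z_i^{(k)} - \eta\,\frac{\partial E}{\partial z_i},$$

so each update mixes a node's state with a weighted average of its neighbors, i.e. a diffusion step; when the graph is partially or completely unobserved, the couplings $w_{ij}$ can themselves be learned over all pairs.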
arXiv Detail & Related papers (2024-09-13T17:54:41Z) - Towards training digitally-tied analog blocks via hybrid gradient computation [1.800676987432211]
We introduce Feedforward-tied Energy-based Models (ff-EBMs).
We derive a novel algorithm to compute gradients end-to-end in ff-EBMs by backpropagating and "eq-propagating" through feedforward and energy-based parts, respectively.
Our approach offers a principled, scalable, and incremental roadmap to gradually integrate self-trainable analog computational primitives into existing digital accelerators.
arXiv Detail & Related papers (2024-09-05T07:22:19Z) - Thermodynamics-Consistent Graph Neural Networks [50.0791489606211]
We propose excess Gibbs free energy graph neural networks (GE-GNNs) for predicting composition-dependent activity coefficients of binary mixtures.
The GE-GNN architecture ensures thermodynamic consistency by predicting the molar excess Gibbs free energy.
We demonstrate high accuracy and thermodynamic consistency of the activity coefficient predictions.
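The consistency guarantee follows from a standard thermodynamic identity: once the molar excess Gibbs free energy $G^E$ is the predicted quantity, activity coefficients are obtained as its composition derivatives (in practice via automatic differentiation),

$$\ln \gamma_i = \left. \frac{\partial\,(n\,G^E/RT)}{\partial n_i} \right|_{T,\,P,\,n_{j \neq i}},$$

so the predicted $\gamma_i$ satisfy the Gibbs-Duhem relation by construction.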
arXiv Detail & Related papers (2024-07-08T06:58:56Z) - TANGO: Time-Reversal Latent GraphODE for Multi-Agent Dynamical Systems [43.39754726042369]
We propose a simple-yet-effective self-supervised regularization term as a soft constraint that aligns the forward and backward trajectories predicted by a continuous graph neural network-based ordinary differential equation (GraphODE).
It effectively imposes time-reversal symmetry to enable more accurate model predictions across a wider range of dynamical systems under classical mechanics.
Experimental results on a variety of physical systems demonstrate the effectiveness of our proposed method.
arXiv Detail & Related papers (2023-10-10T08:52:16Z) - Energy Transformer [64.22957136952725]
Our work combines aspects of three promising paradigms in machine learning, namely the attention mechanism, energy-based models, and associative memory.
We propose a novel architecture, called the Energy Transformer (or ET for short), that uses a sequence of attention layers that are purposely designed to minimize a specifically engineered energy function.
arXiv Detail & Related papers (2023-02-14T18:51:22Z) - Physically Consistent Neural ODEs for Learning Multi-Physics Systems [0.0]
In this paper, we leverage the framework of Irreversible port-Hamiltonian Systems (IPHS), which can describe most multi-physics systems.
We propose Physically Consistent NODEs (PC-NODEs) to learn parameters from data.
We demonstrate the effectiveness of the proposed method by learning the thermodynamics of a building from real-world measurements.
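For reference, a (reversible) port-Hamiltonian system takes the standard form

$$\dot{x} = \big[ J(x) - R(x) \big]\,\nabla H(x) + g(x)\,u, \qquad J = -J^{\top}, \quad R \succeq 0,$$

where the skew-symmetric interconnection J conserves the energy H, R models dissipation, and g(x)u is the external input; the irreversible (IPHS) extension used in the paper further builds non-negative entropy production into the structure, which is what makes the learned models physically consistent.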
arXiv Detail & Related papers (2022-11-11T11:20:35Z) - Geometric Knowledge Distillation: Topology Compression for Graph Neural
Networks [80.8446673089281]
We study a new paradigm of knowledge transfer that aims at encoding graph topological information into graph neural networks (GNNs).
We propose Neural Heat Kernel (NHK) to encapsulate the geometric property of the underlying manifold concerning the architecture of GNNs.
A fundamental and principled solution is derived by aligning NHKs on teacher and student models, dubbed Geometric Knowledge Distillation.
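On a graph, the canonical heat kernel is the matrix exponential of the (negative) Laplacian, $H_t = e^{-tL}$, which describes how heat, and hence topological information, diffuses across the structure; a schematic form of the alignment objective (not necessarily the paper's exact loss) is $\min_\theta \| H_t^{\text{teacher}} - H_t^{\text{student}} \|_F^2$.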
arXiv Detail & Related papers (2022-10-24T08:01:58Z) - Constructing Neural Network-Based Models for Simulating Dynamical Systems [59.0861954179401]
Data-driven modeling is an alternative paradigm that seeks to learn an approximation of the dynamics of a system using observations of the true system.
This paper provides a survey of the different ways to construct models of dynamical systems using neural networks.
In addition to the basic overview, we review the related literature and outline the most significant challenges from numerical simulations that this modeling paradigm must overcome.
arXiv Detail & Related papers (2021-11-02T10:51:42Z) - Quantum Foundations of Classical Reversible Computing [0.0]
Reversible computing is capable of circumventing the thermodynamic limits to the energy efficiency of the conventional, non-reversible digital paradigm.
We use the framework of Gorini-Kossakowski-Sudarshan-Lindblad dynamics (a.k.a. Lindbladians) with multiple states, incorporating recent results from resource theory, full counting statistics, and reversible thermodynamics.
We also outline a research plan for identifying the fundamental minimum energy dissipation of computing machines as a function of speed.
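For context, the GKSL (Lindblad) master equation referenced here has the standard form

$$\dot{\rho} = -\frac{i}{\hbar}\,[H, \rho] + \sum_k \left( L_k \rho L_k^{\dagger} - \tfrac{1}{2} \left\{ L_k^{\dagger} L_k,\, \rho \right\} \right),$$

where $\rho$ is the density matrix, $H$ the Hamiltonian, and the jump operators $L_k$ capture the coupling to the environment responsible for dissipation and decoherence.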
arXiv Detail & Related papers (2021-04-30T19:53:47Z) - Thermodynamic Consistent Neural Networks for Learning Material Interfacial Mechanics [6.087530833458481]
The traction-separation relations (TSR) quantitatively describe the mechanical behavior of a material interface undergoing opening.
A neural network can fit the loading paths well but often fails to obey the laws of physics.
We propose a thermodynamic consistent neural network (TCNN) approach to build a data-driven model of the TSR with sparse experimental data.
arXiv Detail & Related papers (2020-11-28T17:25:10Z) - Training End-to-End Analog Neural Networks with Equilibrium Propagation [64.0476282000118]
We introduce a principled method to train end-to-end analog neural networks by gradient descent.
We show mathematically that a class of analog neural networks (called nonlinear resistive networks) are energy-based models.
Our work can guide the development of a new generation of ultra-fast, compact and low-power neural networks supporting on-chip learning.
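This line of work builds on equilibrium propagation, whose gradient estimate compares two equilibria of the energy E: a free phase $s^0$ and a weakly output-nudged phase $s^{\beta}$,

$$\frac{\partial \mathcal{L}}{\partial \theta} \;\approx\; \frac{1}{\beta} \left( \frac{\partial E}{\partial \theta}\big(s^{\beta}\big) - \frac{\partial E}{\partial \theta}\big(s^{0}\big) \right),$$

so learning needs only locally measurable quantities at two settling points of the physical (here, nonlinear resistive) network, with no separate backward circuit.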
arXiv Detail & Related papers (2020-06-02T23:38:35Z) - Thermodynamics-based Artificial Neural Networks for constitutive modeling [0.0]
We propose a new class of data-driven, physics-based neural networks for modeling strain rate independent processes at the material point level.
The two basic principles of thermodynamics are encoded in the network's architecture by taking advantage of automatic differentiation.
We demonstrate the wide applicability of TANNs for modeling elasto-plastic materials, with strain hardening and softening.
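In this family of models, the two principles are typically encoded as a state law derived from a learned free-energy potential $\psi$ plus a non-negativity constraint on dissipation; a sketch of the standard (isothermal) constraints, with the exact TANN formulation left to the paper:

$$\sigma = \frac{\partial \psi}{\partial \varepsilon}, \qquad \mathcal{D} = \sigma : \dot{\varepsilon} - \dot{\psi} \;\geq\; 0,$$

where the first relation fixes the stress from the energy (first principle) and the second enforces the Clausius-Duhem inequality (second principle), both implementable directly through automatic differentiation of the network's $\psi$.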
arXiv Detail & Related papers (2020-05-25T15:56:34Z) - Parsimonious neural networks learn interpretable physical laws [77.34726150561087]
We propose parsimonious neural networks (PNNs) that combine neural networks with evolutionary optimization to find models that balance accuracy with parsimony.
The power and versatility of the approach are demonstrated by developing models for classical mechanics and for predicting the melting temperature of materials from fundamental properties.
arXiv Detail & Related papers (2020-05-08T16:15:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.