Thermodynamic Bound on Energy and Negentropy Costs of Inference in Deep Neural Networks
- URL: http://arxiv.org/abs/2503.09980v1
- Date: Thu, 13 Mar 2025 02:35:07 GMT
- Title: Thermodynamic Bound on Energy and Negentropy Costs of Inference in Deep Neural Networks
- Authors: Alexei V. Tkachenko
- Abstract summary: The fundamental thermodynamic bound is derived for the energy cost of inference in Deep Neural Networks (DNNs). We show that the linear operations in DNNs can, in principle, be performed reversibly, whereas the non-linear activation functions impose an unavoidable energy cost.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The fundamental thermodynamic bound is derived for the energy cost of inference in Deep Neural Networks (DNNs). By applying Landauer's principle, we demonstrate that the linear operations in DNNs can, in principle, be performed reversibly, whereas the non-linear activation functions impose an unavoidable energy cost. The resulting theoretical lower bound on the inference energy is determined by the average number of neurons undergoing state transition for each inference. We also restate the thermodynamic bound in terms of negentropy, a metric that is more universal than energy for assessing the thermodynamic cost of information processing. The concept of negentropy is further elaborated in the context of information processing in biological and engineered systems, as well as in human intelligence. Our analysis provides insight into the physical limits of DNN efficiency and suggests potential directions for developing energy-efficient AI architectures that leverage reversible analog computing.
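To get a feel for the scale of this bound: Landauer's principle prices each irreversible bit erasure at k_B T ln 2 of dissipated energy. Below is a minimal back-of-envelope sketch, assuming (this is our reading of the abstract, not a formula quoted from the paper) that each neuron state transition costs at least one bit's worth of erasure:

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def landauer_bound(n_transitions: float, temperature: float = 300.0) -> float:
    """Lower bound on dissipated energy (joules) if each neuron state
    transition erases at least one bit: Landauer's k_B * T * ln 2 per bit."""
    return n_transitions * K_B * temperature * math.log(2)

# Example: 1e9 neuron state transitions per inference at room temperature
print(f"{landauer_bound(1e9):.2e} J")  # ~2.87e-12 J
```

At 300 K this comes to roughly 3 pJ per billion transitions, many orders of magnitude below the energy per inference of current digital hardware, which is the headroom the abstract attributes to reversible analog computing.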
Related papers
- Fractional Spike Differential Equations Neural Network with Efficient Adjoint Parameters Training [63.3991315762955]
Spiking Neural Networks (SNNs) draw inspiration from biological neurons to create realistic models for brain-like computation. Most existing SNNs assume a single time constant for neuronal membrane voltage dynamics, modeled by first-order ordinary differential equations (ODEs) with Markovian characteristics. We propose the Fractional SPIKE Differential Equation neural network (fspikeDE), which captures long-term dependencies in membrane voltage and spike trains through fractional-order dynamics.
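The abstract does not spell out fspikeDE's equations, so the sketch below is only a generic fractional-order leaky integrate-and-fire neuron, discretized with Grünwald-Letnikov weights, to illustrate the key point: for order alpha < 1, the voltage update depends on the entire voltage history, not just the previous step (all names and constants here are illustrative):

```python
import numpy as np

def fractional_lif(I, alpha=0.8, dt=1e-3, tau=0.02, v_th=1.0):
    """Toy fractional-order LIF neuron: D^alpha V = (-V + I) / tau,
    discretized with Gruenwald-Letnikov weights (non-Markovian memory)."""
    n = len(I)
    V = np.zeros(n)
    spikes = np.zeros(n, dtype=bool)
    w = np.ones(n)  # w_j = (-1)^j * binom(alpha, j), via the standard recurrence
    for j in range(1, n):
        w[j] = w[j - 1] * (1.0 - (alpha + 1.0) / j)
    h_a = dt ** alpha
    for k in range(1, n):
        memory = np.dot(w[1:k + 1], V[k - 1::-1])  # weighted sum over all past V
        V[k] = h_a * (-V[k - 1] + I[k]) / tau - memory
        if V[k] >= v_th:  # threshold crossing: emit a spike, reset the voltage
            spikes[k] = True
            V[k] = 0.0
    return V, spikes

V, s = fractional_lif(np.full(500, 1.2))  # constant drive; spiking emerges over time
```

Setting alpha = 1 recovers the usual first-order (Markovian) LIF update, which is exactly the limitation the paper targets.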
arXiv Detail & Related papers (2025-07-22T18:20:56Z) - Architecture of Information [0.0]
The paper explores an approach to constructing energy landscapes of a formal neuron and multilayer artificial neural networks (ANNs).
The study of informational and thermodynamic entropy in formal neuron and ANN models leads to the conclusion that informational entropy is energetic in nature.
The presented research makes it possible to formulate a formal definition of information in terms of the interaction processes between the internal and external energy of the system.
arXiv Detail & Related papers (2025-03-21T14:48:41Z) - Sustainable AI: Mathematical Foundations of Spiking Neural Networks [46.76155269576732]
Spiking neural networks, inspired by biological neurons, offer a promising alternative with potential computational and energy-efficiency gains. This article examines the computational properties of spiking networks through the lens of learning theory.
arXiv Detail & Related papers (2025-03-03T19:44:12Z) - Thermodynamic computing out of equilibrium [0.0]
We present the design for a thermodynamic computer that can perform arbitrary nonlinear calculations in or out of equilibrium. Simple thermodynamic circuits, fluctuating degrees of freedom in contact with a thermal bath, display an activity that is a nonlinear function of their input. We simulate a digital model of a thermodynamic neural network, and show that its parameters can be adjusted by a genetic algorithm to perform nonlinear calculations at specified observation times.
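The "fluctuating degrees of freedom in contact with a thermal bath" mentioned here are conventionally described by overdamped Langevin dynamics; a generic form (the paper's specific circuits may differ) is

$$\gamma\,\dot{x} = -\frac{\partial U(x;s)}{\partial x} + \sqrt{2\gamma k_B T}\,\xi(t), \qquad \langle \xi(t)\,\xi(t') \rangle = \delta(t-t'),$$

where the input s shapes the potential U, the thermal noise xi drives the fluctuations, and a time-averaged observable of x then serves as the circuit's nonlinear activity.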
arXiv Detail & Related papers (2024-12-22T22:51:51Z) - Physics-Informed Regularization for Domain-Agnostic Dynamical System Modeling [41.82469276824927]
We present a framework that achieves high-precision modeling for a wide range of dynamical systems.
It helps preserve energies for conservative systems while serving as a strong inductive bias for non-conservative, reversible systems.
By integrating the TRS loss within neural ordinary differential equation models, the proposed model TREAT demonstrates superior performance on diverse physical systems.
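A schematic version of such a time-reversal-symmetry (TRS) penalty, under the usual classical-mechanics convention (reverse time, negate momenta), is sketched below; the actual TREAT loss is defined in the paper, and this toy uses a plain Euler rollout of a hand-written vector field rather than a learned GraphODE:

```python
import numpy as np

def rollout(f, x0, dt, n):
    """Explicit-Euler rollout of dx/dt = f(x), returning all n+1 states."""
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(n):
        xs.append(xs[-1] + dt * f(xs[-1]))
    return np.stack(xs)

def trs_loss(f, x0, dt=0.01, n=200):
    """Penalty that is small iff the trajectory retraces itself when time
    is reversed: flip the momentum at the endpoint, roll out again, and
    compare against the reversed forward trajectory."""
    fwd = rollout(f, x0, dt, n)
    x_rev = fwd[-1].copy()
    x_rev[1] *= -1.0               # state = (q, p); time reversal negates p
    bwd = rollout(f, x_rev, dt, n)
    bwd[:, 1] *= -1.0              # map back to the forward-time convention
    return np.mean((bwd - fwd[::-1]) ** 2)

# Harmonic oscillator dq/dt = p, dp/dt = -q: a time-reversal-symmetric system,
# so the loss is near zero (small residual from Euler discretization error).
print(trs_loss(lambda x: np.array([x[1], -x[0]]), x0=[1.0, 0.0]))
```

Used as a regularizer on a learned dynamics model, this term rewards vector fields that respect the reversibility of the underlying mechanics.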
arXiv Detail & Related papers (2024-10-08T21:04:01Z) - DimOL: Dimensional Awareness as A New 'Dimension' in Operator Learning [63.5925701087252]
We introduce DimOL (Dimension-aware Operator Learning), drawing insights from dimensional analysis. To implement DimOL, we propose the ProdLayer, which can be seamlessly integrated into FNO-based and Transformer-based PDE solvers. Empirically, DimOL models achieve up to 48% performance gain within the PDE datasets.
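The abstract does not define the ProdLayer, so the sketch below is one plausible reading, motivated by the fact that dimensional analysis combines physical quantities multiplicatively: augment the usual linear path with an elementwise product of two learned projections. Treat every detail here as hypothetical:

```python
import torch
import torch.nn as nn

class ProdLayer(nn.Module):
    """Hypothetical product layer: a linear path plus an elementwise product
    of two learned projections, mimicking multiplicative combinations of
    physical quantities (e.g. density * velocity)."""
    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.lin = nn.Linear(d_in, d_out)
        self.a = nn.Linear(d_in, d_out)
        self.b = nn.Linear(d_in, d_out)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.lin(x) + self.a(x) * self.b(x)

y = ProdLayer(16, 32)(torch.randn(8, 16))  # -> shape (8, 32)
```

Because the block keeps the same input/output shape contract as a linear layer, it drops into FNO- or Transformer-style PDE solvers without touching the surrounding architecture, which would match the "seamlessly integrated" claim.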
arXiv Detail & Related papers (2024-10-08T10:48:50Z) - Contrastive Learning in Memristor-based Neuromorphic Systems [55.11642177631929]
Spiking neural networks have become an important family of neuron-based models that sidestep many of the key limitations facing modern-day backpropagation-trained deep networks.
In this work, we design and investigate a proof-of-concept instantiation of contrastive-signal-dependent plasticity (CSDP), a neuromorphic form of forward-forward-based, backpropagation-free learning.
arXiv Detail & Related papers (2024-09-17T04:48:45Z) - Neural Message Passing Induced by Energy-Constrained Diffusion [79.9193447649011]
We propose an energy-constrained diffusion model as a principled interpretable framework for understanding the mechanism of MPNNs.
We show that the new model can yield promising performance for cases where the data structures are observed (as a graph), partially observed or completely unobserved.
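In the simplest quadratic case (the paper treats more general energies), the connection is that one message-passing step is one gradient-descent step on a Dirichlet-type energy:

$$E(Z) = \sum_i \|z_i - x_i\|^2 + \lambda \sum_{(i,j) \in \mathcal{E}} w_{ij}\,\|z_i - z_j\|^2, \qquad z_i^{(k+1)} = z_i^{(k)} - \eta\,\frac{\partial E}{\partial z_i},$$

so each update mixes a node's state with a weighted average of its neighbors, i.e. a diffusion step; when the graph is partially or completely unobserved, the couplings $w_{ij}$ can themselves be learned over all pairs.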
arXiv Detail & Related papers (2024-09-13T17:54:41Z) - Towards training digitally-tied analog blocks via hybrid gradient computation [1.800676987432211]
We introduce Feedforward-tied Energy-based Models (ff-EBMs).
We derive a novel algorithm to compute gradients end-to-end in ff-EBMs by backpropagating and "eq-propagating" through feedforward and energy-based parts, respectively.
Our approach offers a principled, scalable, and incremental roadmap to gradually integrate self-trainable analog computational primitives into existing digital accelerators.
arXiv Detail & Related papers (2024-09-05T07:22:19Z) - Thermodynamics-Consistent Graph Neural Networks [50.0791489606211]
We propose excess Gibbs free energy graph neural networks (GE-GNNs) for predicting composition-dependent activity coefficients of binary mixtures.
The GE-GNN architecture ensures thermodynamic consistency by predicting the molar excess Gibbs free energy.
We demonstrate high accuracy and thermodynamic consistency of the activity coefficient predictions.
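The consistency guarantee follows from a standard thermodynamic identity: once the molar excess Gibbs free energy $G^E$ is the predicted quantity, activity coefficients are obtained as its composition derivatives (in practice via automatic differentiation),

$$\ln \gamma_i = \left. \frac{\partial\,(n\,G^E/RT)}{\partial n_i} \right|_{T,\,P,\,n_{j \neq i}},$$

so the predicted $\gamma_i$ satisfy the Gibbs-Duhem relation by construction.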
arXiv Detail & Related papers (2024-07-08T06:58:56Z) - TANGO: Time-Reversal Latent GraphODE for Multi-Agent Dynamical Systems [43.39754726042369]
We propose a simple-yet-effective self-supervised regularization term as a soft constraint that aligns the forward and backward trajectories predicted by a continuous graph neural network-based ordinary differential equation (GraphODE).
It effectively imposes time-reversal symmetry to enable more accurate model predictions across a wider range of dynamical systems under classical mechanics.
Experimental results on a variety of physical systems demonstrate the effectiveness of our proposed method.
arXiv Detail & Related papers (2023-10-10T08:52:16Z) - Energy Transformer [64.22957136952725]
Our work combines aspects of three promising paradigms in machine learning, namely the attention mechanism, energy-based models, and associative memory.
We propose a novel architecture, called the Energy Transformer (or ET for short), that uses a sequence of attention layers that are purposely designed to minimize a specifically engineered energy function.
arXiv Detail & Related papers (2023-02-14T18:51:22Z) - Physically Consistent Neural ODEs for Learning Multi-Physics Systems [0.0]
In this paper, we leverage the framework of Irreversible port-Hamiltonian Systems (IPHS), which can describe most multi-physics systems.
We propose Physically Consistent NODEs (PC-NODEs) to learn parameters from data.
We demonstrate the effectiveness of the proposed method by learning the thermodynamics of a building from real-world measurements.
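For reference, a (reversible) port-Hamiltonian system takes the standard form

$$\dot{x} = \big[ J(x) - R(x) \big]\,\nabla H(x) + g(x)\,u, \qquad J = -J^{\top}, \quad R \succeq 0,$$

where the skew-symmetric interconnection J conserves the energy H, R models dissipation, and g(x)u is the external input; the irreversible (IPHS) extension used in the paper further builds non-negative entropy production into the structure, which is what makes the learned models physically consistent.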
arXiv Detail & Related papers (2022-11-11T11:20:35Z) - Geometric Knowledge Distillation: Topology Compression for Graph Neural
Networks [80.8446673089281]
We study a new paradigm of knowledge transfer that aims at encoding graph topological information into graph neural networks (GNNs).
We propose Neural Heat Kernel (NHK) to encapsulate the geometric property of the underlying manifold concerning the architecture of GNNs.
A fundamental and principled solution is derived by aligning NHKs on teacher and student models, dubbed Geometric Knowledge Distillation.
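On a graph, the canonical heat kernel is the matrix exponential of the (negative) Laplacian, $H_t = e^{-tL}$, which describes how heat, and hence topological information, diffuses across the structure; a schematic form of the alignment objective (not necessarily the paper's exact loss) is $\min_\theta \| H_t^{\text{teacher}} - H_t^{\text{student}} \|_F^2$.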
arXiv Detail & Related papers (2022-10-24T08:01:58Z) - Constructing Neural Network-Based Models for Simulating Dynamical Systems [59.0861954179401]
Data-driven modeling is an alternative paradigm that seeks to learn an approximation of the dynamics of a system using observations of the true system.
This paper provides a survey of the different ways to construct models of dynamical systems using neural networks.
In addition to the basic overview, we review the related literature and outline the most significant challenges from numerical simulations that this modeling paradigm must overcome.
arXiv Detail & Related papers (2021-11-02T10:51:42Z) - Quantum Foundations of Classical Reversible Computing [0.0]
Reversible computing is capable of circumventing the thermodynamic limits to the energy efficiency of the conventional, non-reversible digital paradigm.
We use the framework of Gorini-Kossakowski-Sudarshan-Lindblad dynamics (a.k.a. Lindbladians) with multiple states, incorporating recent results from resource theory, full counting statistics, and reversible thermodynamics.
We also outline a research plan for identifying the fundamental minimum energy dissipation of computing machines as a function of speed.
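For context, the GKSL (Lindblad) master equation referenced here has the standard form

$$\dot{\rho} = -\frac{i}{\hbar}\,[H, \rho] + \sum_k \left( L_k \rho L_k^{\dagger} - \tfrac{1}{2} \left\{ L_k^{\dagger} L_k,\, \rho \right\} \right),$$

where $\rho$ is the density matrix, $H$ the Hamiltonian, and the jump operators $L_k$ capture the coupling to the environment responsible for dissipation and decoherence.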
arXiv Detail & Related papers (2021-04-30T19:53:47Z) - Thermodynamic Consistent Neural Networks for Learning Material Interfacial Mechanics [6.087530833458481]
The traction-separation relations (TSR) quantitatively describe the mechanical behavior of a material interface undergoing opening.
A neural network can fit the loading paths well but often fails to obey the laws of physics.
We propose a thermodynamic consistent neural network (TCNN) approach to build a data-driven model of the TSR with sparse experimental data.
arXiv Detail & Related papers (2020-11-28T17:25:10Z) - Training End-to-End Analog Neural Networks with Equilibrium Propagation [64.0476282000118]
We introduce a principled method to train end-to-end analog neural networks by gradient descent.
We show mathematically that a class of analog neural networks (called nonlinear resistive networks) are energy-based models.
Our work can guide the development of a new generation of ultra-fast, compact and low-power neural networks supporting on-chip learning.
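This line of work builds on equilibrium propagation, whose gradient estimate compares two equilibria of the energy E: a free phase $s^0$ and a weakly output-nudged phase $s^{\beta}$,

$$\frac{\partial \mathcal{L}}{\partial \theta} \;\approx\; \frac{1}{\beta} \left( \frac{\partial E}{\partial \theta}\big(s^{\beta}\big) - \frac{\partial E}{\partial \theta}\big(s^{0}\big) \right),$$

so learning needs only locally measurable quantities at two settling points of the physical (here, nonlinear resistive) network, with no separate backward circuit.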
arXiv Detail & Related papers (2020-06-02T23:38:35Z) - Thermodynamics-based Artificial Neural Networks for constitutive modeling [0.0]
We propose a new class of data-driven, physics-based neural networks for modeling strain rate independent processes at the material point level.
The two basic principles of thermodynamics are encoded in the network's architecture by taking advantage of automatic differentiation.
We demonstrate the wide applicability of TANNs for modeling elasto-plastic materials, with strain hardening and softening.
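In this family of models, the two principles are typically encoded as a state law derived from a learned free-energy potential $\psi$ plus a non-negativity constraint on dissipation; a sketch of the standard (isothermal) constraints, with the exact TANN formulation left to the paper:

$$\sigma = \frac{\partial \psi}{\partial \varepsilon}, \qquad \mathcal{D} = \sigma : \dot{\varepsilon} - \dot{\psi} \;\geq\; 0,$$

where the first relation fixes the stress from the energy (first principle) and the second enforces the Clausius-Duhem inequality (second principle), both implementable directly through automatic differentiation of the network's $\psi$.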
arXiv Detail & Related papers (2020-05-25T15:56:34Z) - Parsimonious neural networks learn interpretable physical laws [77.34726150561087]
We propose parsimonious neural networks (PNNs) that combine neural networks with evolutionary optimization to find models that balance accuracy with parsimony.
The power and versatility of the approach are demonstrated by developing models for classical mechanics and for predicting the melting temperature of materials from fundamental properties.
arXiv Detail & Related papers (2020-05-08T16:15:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.