Metabolic cost of information processing in Poisson variational autoencoders
- URL: http://arxiv.org/abs/2602.13421v1
- Date: Fri, 13 Feb 2026 19:46:11 GMT
- Title: Metabolic cost of information processing in Poisson variational autoencoders
- Authors: Hadi Vafaii, Jacob L. Yates
- Abstract summary: We argue that variational free energy minimization offers a principled path toward an energy-aware theory of computation. Our key observation is that the Kullback-Leibler (KL) divergence term in the Poisson free energy objective becomes proportional to the prior firing rates of model neurons. We find that increasing $β$ monotonically increases sparsity and reduces average spiking activity in the Poisson variational autoencoder.
- Score: 5.156484100374059
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computation in biological systems is fundamentally energy-constrained, yet standard theories of computation treat energy as freely available. Here, we argue that variational free energy minimization under a Poisson assumption offers a principled path toward an energy-aware theory of computation. Our key observation is that the Kullback-Leibler (KL) divergence term in the Poisson free energy objective becomes proportional to the prior firing rates of model neurons, yielding an emergent metabolic cost term that penalizes high baseline activity. This structure couples an abstract information-theoretic quantity -- the *coding rate* -- to a concrete biophysical variable -- the *firing rate* -- which enables a trade-off between coding fidelity and energy expenditure. Such a coupling arises naturally in the Poisson variational autoencoder (P-VAE) -- a brain-inspired generative model that encodes inputs as discrete spike counts and recovers a spiking form of *sparse coding* as a special case -- but is absent from standard Gaussian VAEs. To demonstrate that this metabolic cost structure is unique to the Poisson formulation, we compare the P-VAE against Grelu-VAE, a Gaussian VAE with ReLU rectification applied to latent samples, which controls for the non-negativity constraint. Across a systematic sweep of the KL term weighting coefficient $β$ and latent dimensionality, we find that increasing $β$ monotonically increases sparsity and reduces average spiking activity in the P-VAE. In contrast, Grelu-VAE representations remain unchanged, confirming that the effect is specific to Poisson statistics rather than a byproduct of non-negative representations. These results establish Poisson variational inference as a promising foundation for a resource-constrained theory of computation.
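To make the coupling concrete, here is a minimal sketch (ours, not the authors' code) of the closed-form KL divergence between two Poisson distributions. For a fixed relative modulation of the posterior rate, the KL cost is exactly proportional to the prior firing rate, which is the emergent metabolic cost term described above:

```python
import math

def poisson_kl(rate_q: float, rate_p: float) -> float:
    """Closed-form KL( Poisson(rate_q) || Poisson(rate_p) ).

    KL = rate_q * log(rate_q / rate_p) - rate_q + rate_p
    """
    return rate_q * math.log(rate_q / rate_p) - rate_q + rate_p

# With a fixed relative modulation r = rate_q / rate_p, the KL collapses
# to rate_p * (r*log(r) - r + 1) -- linear in the prior firing rate.
r = 2.0
for prior_rate in [0.5, 1.0, 2.0, 4.0]:
    kl = poisson_kl(r * prior_rate, prior_rate)
    print(f"prior rate {prior_rate:.1f} -> KL {kl:.4f} "
          f"(KL / prior rate = {kl / prior_rate:.4f})")
```

The constant last column is the point: weighting this KL term by $β$ penalizes baseline spiking activity directly. The full P-VAE objective adds a reconstruction term, omitted here.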
Related papers
- Entropy-Based Dimension-Free Convergence and Loss-Adaptive Schedules for Diffusion Models [3.2091923314854416]
Diffusion generative models synthesize samples by discretizing reverse-time dynamics driven by a learned score (or denoiser). We develop an information-theoretic approach to dimension-free convergence that avoids geometric assumptions. We also propose a Loss-Adaptive Schedule (LAS) for efficient discretization of the reverse SDE, which is lightweight and relies only on the training loss.
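For context, a standard formulation of the reverse-time dynamics mentioned in this summary (a generic sketch; the paper's specific schedule and assumptions may differ) replaces the true score $\nabla_x \log p_t(x)$ with a learned network $s_\theta(x, t)$:

```latex
% Reverse-time SDE (Anderson, 1982); \bar{w} is a reverse-time Wiener
% process, f the forward drift, g the diffusion coefficient.
\mathrm{d}x = \left[ f(x, t) - g(t)^2 \, \nabla_x \log p_t(x) \right] \mathrm{d}t
            + g(t) \, \mathrm{d}\bar{w}
```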
arXiv Detail & Related papers (2026-01-29T16:28:21Z) - From Regression to Classification: Exploring the Benefits of Categorical Representations of Energy in MLIPs [1.0998907972211756]
Density Functional Theory (DFT) is a widely used computational method for estimating the energy and behavior of molecules. Machine Learning Interatomic Potentials (MLIPs) are models trained to approximate DFT-level energies and forces at dramatically lower computational cost. In this work, we explore a multi-class classification formulation that predicts a categorical distribution over energy/force values.
arXiv Detail & Related papers (2025-12-01T00:36:42Z) - Energy-Based Models for Predicting Mutational Effects on Proteins [42.043597166564524]
We propose a new approach to predicting changes in binding free energy ($\Delta\Delta G$). We decompose $\Delta\Delta G$ into a sequence-based component estimated by an inverse folding model and a structure-based component estimated by an energy model. Our method incorporates an energy-based physical inductive bias by connecting the widely used sequence log-odds-ratio approach to $\Delta\Delta G$ prediction with a new $\Delta\Delta E$ term grounded in statistical mechanics.
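Read schematically (our hedged gloss of this summary, not the paper's exact equations), the decomposition pairs an inverse-folding log-odds term with an energy difference:

```latex
% Illustrative only: signs and the scaling constant \alpha are assumptions.
\Delta\Delta G \;\approx\;
  \underbrace{-\alpha \left[ \log p(s^{\mathrm{mut}} \mid x)
                           - \log p(s^{\mathrm{wt}} \mid x) \right]}_{\text{sequence-based (inverse folding)}}
  \;+\; \underbrace{\Delta\Delta E}_{\text{structure-based (energy model)}}
```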
arXiv Detail & Related papers (2025-08-14T13:30:19Z) - Negative Binomial Variational Autoencoders for Overdispersed Latent Modeling [22.62423547669558]
Recent work makes a biologically inspired move by modeling spike counts using the Poisson distribution. We introduce NegBio-VAE, a principled extension of the VAE framework that models spike counts using the negative binomial distribution. This shift grants explicit control over dispersion, unlocking a broader and more accurate family of neural representations.
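The dispersion control being claimed is easy to state: a negative binomial with mean $\mu$ and dispersion parameter $r$ has

```latex
\operatorname{Var}[Y] = \mu + \frac{\mu^{2}}{r} \;\ge\; \mu,
\qquad \operatorname{Var}[Y] \to \mu \ \text{(the Poisson limit) as } r \to \infty,
```

so one extra parameter relaxes the Poisson distribution's rigid equality of mean and variance.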
arXiv Detail & Related papers (2025-08-07T14:15:09Z) - Consistent Sampling and Simulation: Molecular Dynamics with Energy-Based Diffusion Models [50.77646970127369]
We propose an energy-based diffusion model with a Fokker-Planck-derived regularization term to enforce consistency. We demonstrate our approach by sampling and simulating multiple biomolecular systems, including fast-folding proteins.
arXiv Detail & Related papers (2025-06-20T16:38:29Z) - Brain-like Variational Inference [5.862480696321742]
We introduce FOND (Free energy Online Natural-gradient Dynamics), a framework that derives neural inference dynamics from three principles. We apply FOND to derive iP-VAE (iterative Poisson variational autoencoder), a recurrent spiking neural network that performs variational inference through membrane potential dynamics. Empirically, iP-VAE outperforms both standard VAEs and Gaussian-based predictive coding models in sparsity, reconstruction, and biological plausibility.
arXiv Detail & Related papers (2024-10-25T06:00:18Z) - Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions [48.58317905849438]
Predicting the change in binding free energy ($\Delta\Delta G$) is crucial for understanding and modulating protein-protein interactions.
We propose the Boltzmann Alignment technique to transfer knowledge from pre-trained inverse folding models to $\Delta\Delta G$ prediction.
arXiv Detail & Related papers (2024-10-12T14:13:42Z) - Zero-inflation in the Multivariate Poisson Lognormal Family [0.5249805590164902]
We introduce the Zero-Inflated PLN (ZIPLN) model, adding a multivariate zero-inflated component to the model as an additional Bernoulli latent variable. We estimate model parameters using variational inference that scales up to datasets with a few thousand variables. We then apply both ZIPLN and PLN to a cow microbiome dataset containing 90.6% zeroes.
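The zero-inflation mechanism amounts to a mixture for the zero counts: conditional on a latent rate $\lambda$ (lognormal in the PLN family) and inflation probability $\pi$,

```latex
P(Y = 0 \mid \lambda, \pi) = \pi + (1 - \pi)\, e^{-\lambda},
```

which lets the model absorb excess zeroes (such as the 90.6% here) without distorting the fit to the nonzero counts.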
arXiv Detail & Related papers (2024-05-23T15:45:21Z) - Poisson Variational Autoencoder [0.0]
Variational autoencoders (VAEs) employ Bayesian inference to interpret sensory inputs. Here, we develop a novel architecture that combines principles of predictive coding with a VAE that encodes inputs into discrete spike counts. Our work provides an interpretable computational framework to study brain-like sensory processing.
arXiv Detail & Related papers (2024-05-23T12:02:54Z) - Scaling and renormalization in high-dimensional regression [72.59731158970894]
We present a unifying perspective on recent results on ridge regression. We use the basic tools of random matrix theory and free probability, aimed at readers with backgrounds in physics and deep learning. Our results extend earlier models of scaling laws and place them in a common framework.
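For reference, the estimator whose high-dimensional risk such random-matrix analyses characterize is the familiar ridge solution:

```latex
\hat{w}_{\lambda} = \left( X^{\top} X + \lambda I \right)^{-1} X^{\top} y .
```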
arXiv Detail & Related papers (2024-05-01T15:59:00Z) - Sampling with Mollified Interaction Energy Descent [57.00583139477843]
We present a new optimization-based method for sampling called mollified interaction energy descent (MIED).
MIED minimizes a new class of energies on probability measures called mollified interaction energies (MIEs).
We show experimentally that for unconstrained sampling problems our algorithm performs on par with existing particle-based algorithms like SVGD.
arXiv Detail & Related papers (2022-10-24T16:54:18Z) - Robust PAC$^m$: Training Ensemble Models Under Misspecification and Outliers [46.38465729190199]
PAC-Bayes theory demonstrates that the free energy criterion minimized by Bayesian learning is a bound on the generalization error for Gibbs predictors.
This work presents a novel robust free energy criterion that combines the generalized score function with PAC$^m$ ensemble bounds.
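The free energy criterion referred to takes the standard Gibbs form (one common parameterization; the robust variant proposed here modifies the loss term):

```latex
% q: posterior over models, p: prior, \hat{L}_n: empirical loss,
% \beta: inverse temperature, n: sample size.
\mathcal{F}(q) = \mathbb{E}_{\theta \sim q}\!\left[ \hat{L}_n(\theta) \right]
               + \frac{1}{\beta n}\, \mathrm{KL}\!\left( q \,\middle\|\, p \right).
```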
arXiv Detail & Related papers (2022-03-03T17:11:07Z) - Loss function based second-order Jensen inequality and its application to particle variational inference [112.58907653042317]
Particle variational inference (PVI) uses an ensemble of models as an empirical approximation for the posterior distribution.
PVI iteratively updates each model with a repulsion force to ensure the diversity of the optimized models.
We derive a novel generalization error bound and show that it can be reduced by enhancing the diversity of models.
arXiv Detail & Related papers (2021-06-09T12:13:51Z) - Targeted free energy estimation via learned mappings [66.20146549150475]
Free energy perturbation (FEP) was proposed by Zwanzig more than six decades ago as a method to estimate free energy differences.
FEP suffers from a severe limitation: the requirement of sufficient overlap between distributions.
One strategy to mitigate this problem, called Targeted Free Energy Perturbation, uses a high-dimensional mapping in configuration space to increase overlap.
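For context, Zwanzig's identity estimates the free energy difference between states $A$ and $B$ from samples of $A$ alone, which is where the overlap requirement comes from:

```latex
% \beta here is the inverse temperature 1/(k_B T); the average is taken
% over configurations sampled from state A.
\Delta F = F_B - F_A = -\beta^{-1}
  \ln \left\langle e^{-\beta\, (U_B - U_A)} \right\rangle_{A} .
```

The exponential average is dominated by rare configurations when the two distributions barely overlap, which is exactly the failure mode that targeted mappings aim to mitigate.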
arXiv Detail & Related papers (2020-02-12T11:10:00Z)