Kolmogorov-Arnold Energy Models: Fast and Interpretable Generative Modeling
- URL: http://arxiv.org/abs/2506.14167v6
- Date: Wed, 29 Oct 2025 01:00:08 GMT
- Title: Kolmogorov-Arnold Energy Models: Fast and Interpretable Generative Modeling
- Authors: Prithvi Raj
- Abstract summary: We introduce the Kolmogorov-Arnold Energy Model (KAEM) to take advantage of structural and inductive biases. KAEM balances common generative modeling trade-offs, offering fast inference, interpretability, and stable training, while being naturally suited to Zettascale Computing hardware.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning an energy-based model (EBM) in the latent space of a top-down generative model offers a powerful framework for generation across many data modalities. However, it remains unclear how its interpretability can be used to guide model design, improve generative quality, and reduce training time. Moreover, the reliance on Langevin Monte Carlo (LMC) sampling presents challenges in efficiency and sampling multimodal latent distributions. We propose a novel adaptation of the Kolmogorov-Arnold representation theorem for generative modeling and introduce the Kolmogorov-Arnold Energy Model (KAEM) to take advantage of structural and inductive biases. By constraining the prior to univariate relationships, KAEM enables fast and exact inference via the inverse transform method. With the low dimensionality of the latent space and suitable inductive biases encoded, we demonstrate that importance sampling (IS) becomes a viable, unbiased, and highly efficient posterior sampler. For domains where IS fails, we introduce a strategy based on population-based LMC, decomposing the posterior into a sequence of annealed distributions to improve LMC mixing. KAEM balances common generative modeling trade-offs, offering fast inference, interpretability, and stable training, while being naturally suited to Zettascale Computing hardware.
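The abstract credits KAEM's fast, exact prior inference to the inverse transform method applied to univariate relationships. As a minimal illustrative sketch (not the authors' implementation; all names here are hypothetical), a univariate density tabulated on a grid can be sampled exactly by numerically inverting its CDF:

```python
import numpy as np

def inverse_transform_sample(log_density, grid, n_samples, rng):
    """Draw samples from a 1-D density given on `grid` by inverting its CDF."""
    density = np.exp(log_density(grid))
    density /= np.trapz(density, grid)   # normalize on the grid
    cdf = np.cumsum(density)
    cdf /= cdf[-1]                       # empirical CDF in [0, 1]
    u = rng.uniform(size=n_samples)      # uniform draws
    return np.interp(u, cdf, grid)       # map uniforms through the inverse CDF

# Example: a standard normal via its unnormalized log-density.
rng = np.random.default_rng(0)
samples = inverse_transform_sample(
    lambda z: -0.5 * z**2, np.linspace(-5, 5, 2001), 10000, rng)
```

Because the map from uniforms to samples is deterministic and requires no iteration, this kind of sampler avoids the mixing issues that motivate the paper's discussion of Langevin Monte Carlo.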
Related papers
- Simulated Annealing Enhances Theory-of-Mind Reasoning in Autoregressive Language Models [1.4323566945483497]
Theory of Mind (ToM) tasks crucially depend on reasoning about latent mental states of oneself and others. We show that strong ToM capability can be recovered directly from the base model without any additional weight updates or verifications.
arXiv Detail & Related papers (2026-01-18T05:51:30Z) - Equilibrium Matching: Generative Modeling with Implicit Energy-Based Models [52.74448905289362]
EqM is a generative modeling framework built from an equilibrium dynamics perspective. By replacing time-conditional velocities with a unified equilibrium landscape, EqM offers a tighter bridge between flow and energy-based models.
arXiv Detail & Related papers (2025-10-02T17:59:06Z) - Energy-Based Diffusion Language Models for Text Generation [126.23425882687195]
Energy-based Diffusion Language Model (EDLM) is an energy-based model operating at the full sequence level for each diffusion step. Our framework offers a 1.3$\times$ sampling speedup over existing diffusion models.
arXiv Detail & Related papers (2024-10-28T17:25:56Z) - Kolmogorov-Arnold Networks for Time Series: Bridging Predictive Power and Interpretability [6.4314326272535896]
Kolmogorov-Arnold Networks (KAN) is a groundbreaking model recently proposed by the MIT team.
T-KAN is designed to detect concept drift within time series and can explain the nonlinear relationships between predictions and previous time steps.
MT-KAN, on the other hand, improves predictive performance by effectively uncovering and leveraging the complex relationships among variables.
arXiv Detail & Related papers (2024-06-04T17:14:31Z) - MGE: A Training-Free and Efficient Model Generation and Enhancement Scheme [10.48591131837771]
This paper proposes a Training-Free and Efficient Model Generation and Enhancement Scheme (MGE)
It considers two aspects during the model generation process: the distribution of model parameters and model performance.
Experimental results show that the generated models are comparable to models obtained through normal training, and even superior in some cases.
arXiv Detail & Related papers (2024-02-27T13:12:00Z) - Generalized Contrastive Divergence: Joint Training of Energy-Based Model and Diffusion Model through Inverse Reinforcement Learning [13.22531381403974]
Generalized Contrastive Divergence (GCD) is a novel objective function for training an energy-based model (EBM) and a sampler simultaneously.
We present preliminary yet promising results showing that joint training is beneficial for both the EBM and the diffusion model.
arXiv Detail & Related papers (2023-12-06T10:10:21Z) - STANLEY: Stochastic Gradient Anisotropic Langevin Dynamics for Learning Energy-Based Models [41.031470884141775]
We present an end-to-end learning algorithm for Energy-Based models (EBMs). We propose a novel high-dimensional sampling method based on an anisotropic stepsize and a gradient-informed covariance matrix.
Our resulting method, namely STANLEY, is an optimization algorithm for training Energy-Based models via our newly introduced MCMC method.
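The summary above describes a Langevin update whose stepsize varies per coordinate. As a hedged sketch (not the authors' STANLEY implementation; the diagonal preconditioner here is an illustrative stand-in for their gradient-informed covariance), one anisotropic Langevin step looks like:

```python
import numpy as np

def anisotropic_langevin_step(x, grad_energy, step, rng):
    """One Langevin step with a diagonal, per-coordinate stepsize `step`."""
    noise = rng.standard_normal(x.shape)
    return x - step * grad_energy(x) + np.sqrt(2.0 * step) * noise

# Example: sample from N(0, diag(1, 4)) via its energy gradient,
# using larger steps along the higher-variance coordinate.
grad = lambda x: x / np.array([1.0, 4.0])   # gradient of E(x) = x^T Sigma^-1 x / 2
step = np.array([0.1, 0.4])
rng = np.random.default_rng(1)
x = np.zeros(2)
draws = []
for _ in range(5000):
    x = anisotropic_langevin_step(x, grad, step, rng)
    draws.append(x.copy())
```

Matching the stepsize to the local geometry is what lets such samplers mix quickly along directions where an isotropic step would be too cautious.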
arXiv Detail & Related papers (2023-10-19T11:55:16Z) - Learning Energy-Based Prior Model with Diffusion-Amortized MCMC [89.95629196907082]
The common practice of learning latent-space EBMs with non-convergent short-run MCMC for prior and posterior sampling hinders the model from further progress.
We introduce a simple but effective diffusion-based amortization method for long-run MCMC sampling and develop a novel learning algorithm for the latent space EBM based on it.
arXiv Detail & Related papers (2023-10-05T00:23:34Z) - Learning Energy-Based Models by Cooperative Diffusion Recovery Likelihood [64.95663299945171]
Training energy-based models (EBMs) on high-dimensional data can be both challenging and time-consuming.
There exists a noticeable gap in sample quality between EBMs and other generative frameworks like GANs and diffusion models.
We propose cooperative diffusion recovery likelihood (CDRL), an effective approach to tractably learn and sample from a series of EBMs.
arXiv Detail & Related papers (2023-09-10T22:05:24Z) - Adversarial Training Improves Joint Energy-Based Generative Modelling [1.0878040851638]
We propose a novel framework for generative modelling using hybrid energy-based models.
In our method we combine the interpretable input gradients of the robust classifier and Langevin Dynamics for sampling.
arXiv Detail & Related papers (2022-07-18T21:30:03Z) - Your Autoregressive Generative Model Can be Better If You Treat It as an Energy-Based One [83.5162421521224]
We propose a unique method termed E-ARM for training autoregressive generative models.
E-ARM takes advantage of a well-designed energy-based learning objective.
We show that E-ARM can be trained efficiently and is capable of alleviating the exposure bias problem.
arXiv Detail & Related papers (2022-06-26T10:58:41Z) - Latent Diffusion Energy-Based Model for Interpretable Text Modeling [104.85356157724372]
We introduce a novel symbiosis between the diffusion models and latent space EBMs in a variational learning framework.
We develop a geometric clustering-based regularization jointly with the information bottleneck to further improve the quality of the learned latent space.
arXiv Detail & Related papers (2022-06-13T03:41:31Z) - An Energy-Based Prior for Generative Saliency [62.79775297611203]
We propose a novel generative saliency prediction framework that adopts an informative energy-based model as a prior distribution.
With the generative saliency model, we can obtain a pixel-wise uncertainty map from an image, indicating model confidence in the saliency prediction.
Experimental results show that our generative saliency model with an energy-based prior can achieve not only accurate saliency predictions but also reliable uncertainty maps consistent with human perception.
arXiv Detail & Related papers (2022-04-19T10:51:00Z) - A Unified Contrastive Energy-based Model for Understanding the Generative Ability of Adversarial Training [64.71254710803368]
Adversarial Training (AT) is an effective approach to enhance the robustness of deep neural networks.
We demystify this phenomenon by developing a unified probabilistic framework, called Contrastive Energy-based Models (CEM). We propose a principled method to develop adversarial learning and sampling methods.
arXiv Detail & Related papers (2022-03-25T05:33:34Z) - Particle Dynamics for Learning EBMs [83.59335980576637]
Energy-based modeling is a promising approach to unsupervised learning, which yields many downstream applications from a single model.
The main difficulty in learning energy-based models with the "contrastive approaches" is the generation of samples from the current energy function at each iteration.
This paper proposes an alternative approach to getting these samples and avoiding crude MCMC sampling from the current model.
arXiv Detail & Related papers (2021-11-26T23:41:07Z) - Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z) - Prediction-Centric Learning of Independent Cascade Dynamics from Partial Observations [13.680949377743392]
We address the problem of learning a spreading model such that the predictions generated from this model are accurate.
We introduce a computationally efficient algorithm, based on a scalable dynamic message-passing approach.
We show that tractable inference from the learned model generates a better prediction of marginal probabilities compared to the original model.
arXiv Detail & Related papers (2020-07-13T17:58:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.