From Kinetic Theory to AI: a Rediscovery of High-Dimensional Divergences and Their Properties
- URL: http://arxiv.org/abs/2507.11387v1
- Date: Tue, 15 Jul 2025 14:56:25 GMT
- Title: From Kinetic Theory to AI: a Rediscovery of High-Dimensional Divergences and Their Properties
- Authors: Gennaro Auricchio, Giovanni Brigati, Paolo Giudici, Giuseppe Toscani
- Abstract summary: Among the most widely used divergence measures, we find the Kullback-Leibler (KL) divergence, originally introduced in kinetic theory as a measure of relative entropy between probability distributions. We present a comparative review of divergence measures rooted in kinetic theory, highlighting their theoretical foundations and exploring their potential applications in machine learning and artificial intelligence.
- Score: 1.4999444543328293
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Selecting an appropriate divergence measure is a critical aspect of machine learning, as it directly impacts model performance. Among the most widely used, we find the Kullback-Leibler (KL) divergence, originally introduced in kinetic theory as a measure of relative entropy between probability distributions. Just as in machine learning, the ability to quantify the proximity of probability distributions plays a central role in kinetic theory. In this paper, we present a comparative review of divergence measures rooted in kinetic theory, highlighting their theoretical foundations and exploring their potential applications in machine learning and artificial intelligence.
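For reference (the standard definition, not a result specific to this paper), the KL divergence of a distribution $P$ with density $p$ from a distribution $Q$ with density $q$ is
$$ D_{\mathrm{KL}}(P \,\|\, Q) = \int p(x) \log \frac{p(x)}{q(x)} \, dx , $$
which is nonnegative and vanishes if and only if $P = Q$, but is neither symmetric nor a metric.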
Related papers
- Jarzynski Reweighting and Sampling Dynamics for Training Energy-Based Models: Theoretical Analysis of Different Transition Kernels [0.0]
Energy-Based Models (EBMs) provide a flexible framework for generative modeling. Traditional methods, such as contrastive divergence and score matching, introduce biases that can hinder accurate learning. We present a theoretical analysis of Jarzynski reweighting, a technique from non-equilibrium statistical mechanics, and its implications for training EBMs.
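A minimal sketch of the Jarzynski-style reweighting idea, with toy choices throughout (the quadratic energy, step size, and parameter schedule are illustrative assumptions, not the paper's setup): unadjusted Langevin walkers carry importance weights that correct for being out of equilibrium as the energy parameters change.

```python
import numpy as np

rng = np.random.default_rng(0)

def energy(x, theta):
    # toy quadratic EBM: U_theta(x) = theta * x^2 / 2  (illustrative only)
    return 0.5 * theta * x**2

n_walkers, n_steps, dt = 1000, 200, 1e-2
x = rng.normal(size=n_walkers)           # walkers
logw = np.zeros(n_walkers)               # log importance weights
thetas = np.linspace(0.5, 2.0, n_steps)  # slowly varying parameters (a "protocol")

for t in range(1, n_steps):
    # Jarzynski-style weight update: accumulate the "work" done when the
    # energy parameters change while the walkers are held fixed.
    logw -= energy(x, thetas[t]) - energy(x, thetas[t - 1])
    # unadjusted Langevin step toward the current energy's equilibrium
    grad = thetas[t] * x
    x += -dt * grad + np.sqrt(2 * dt) * rng.normal(size=n_walkers)

w = np.exp(logw - logw.max())
w /= w.sum()
# Reweighted averages approximate equilibrium expectations under thetas[-1]:
second_moment = np.sum(w * x**2)  # should approach 1 / thetas[-1] = 0.5
```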
arXiv Detail & Related papers (2025-06-09T15:08:01Z) - Models of Heavy-Tailed Mechanistic Universality [62.107333654304014]
We propose a family of random matrix models to explore attributes that give rise to heavy-tailed behavior in trained neural networks. Under this model, spectral densities with power-law tails arise through a combination of three independent factors. Implications of our model for other appearances of heavy tails, including neural scaling laws, trajectories, and the five-plus-one phases of neural network training, are discussed.
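Here "power-law tails" refers to an empirical spectral density of the weight matrices whose tail decays polynomially, schematically
$$ \rho(\lambda) \sim \lambda^{-\alpha} \quad \text{as } \lambda \to \infty , $$
for some exponent $\alpha > 0$ (an illustrative form; the paper's exact scaling and exponents are not reproduced here).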
arXiv Detail & Related papers (2025-06-04T00:55:01Z) - Review and Prospect of Algebraic Research in Equivalent Framework between Statistical Mechanics and Machine Learning Theory [0.0]
It is well known that algebraic approaches in statistical mechanics, such as operator algebras, enable us to analyze phase transition phenomena mathematically. This paper is devoted to the memory of Professor Huzihiro Araki, a pioneering founder of algebraic research in both statistical mechanics and quantum field theory.
arXiv Detail & Related papers (2024-05-31T11:04:13Z) - CogDPM: Diffusion Probabilistic Models via Cognitive Predictive Coding [62.075029712357]
This work introduces the Cognitive Diffusion Probabilistic Models (CogDPM).
CogDPM features a precision estimation method based on the hierarchical sampling capabilities of diffusion models, and weights the guidance with precision weights estimated from the inherent properties of diffusion models.
We apply CogDPM to real-world prediction tasks using United Kingdom precipitation and surface wind datasets.
arXiv Detail & Related papers (2024-05-03T15:54:50Z) - Machine Learning of the Prime Distribution [49.84018914962972]
We provide a theoretical argument explaining the experimental observations of Yang-Hui He about the learnability of primes.
We also posit that the Erdős-Kac law would be very unlikely to be discovered by current machine learning techniques.
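For context (the standard statement of the theorem, added here for reference), the Erdős-Kac theorem says that the number of distinct prime factors $\omega(n)$ of a uniformly random $n \le N$ is asymptotically Gaussian:
$$ \frac{\omega(n) - \log\log N}{\sqrt{\log\log N}} \;\xrightarrow{d}\; \mathcal{N}(0,1) \quad \text{as } N \to \infty . $$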
arXiv Detail & Related papers (2024-03-19T09:47:54Z) - On the Joint Interaction of Models, Data, and Features [82.60073661644435]
We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features.
Based on empirical observations of this tensor, we propose a conceptual framework for feature learning.
Under this framework, the expected accuracy for a single hypothesis and agreement for a pair of hypotheses can both be derived in closed-form.
arXiv Detail & Related papers (2023-06-07T21:35:26Z) - A theory of learning with constrained weight-distribution [17.492950552276067]
We develop a statistical mechanical theory of learning in neural networks that incorporates structural information as constraints.
We show that training in our algorithm can be interpreted as a geodesic flow in the Wasserstein space of probability distributions.
Our theory and algorithm provide novel strategies for incorporating prior knowledge about weights into learning, and reveal a powerful connection between structure and function in neural networks.
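For reference, the Wasserstein space in question is $(\mathcal{P}_2, W_2)$, where
$$ W_2(\mu, \nu)^2 = \inf_{\pi \in \Pi(\mu,\nu)} \int \|x - y\|^2 \, d\pi(x, y) $$
and $\Pi(\mu,\nu)$ is the set of couplings of $\mu$ and $\nu$; geodesics in this metric are given by displacement interpolation along optimal couplings (standard background, not a claim of the paper).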
arXiv Detail & Related papers (2022-06-14T00:43:34Z) - Discovering Latent Causal Variables via Mechanism Sparsity: A New Principle for Nonlinear ICA [81.4991350761909]
Independent component analysis (ICA) refers to an ensemble of methods which formalize the goal of recovering statistically independent latent variables and provide estimation procedures for practical application.
We show that the latent variables can be recovered up to a permutation if one regularizes the latent mechanisms to be sparse.
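Schematically, and only as a hedged paraphrase rather than the paper's exact formulation, the idea is to fit the latent model by maximum likelihood while constraining the dependency graph $G$ of the latent mechanisms to be sparse:
$$ \max_{\theta} \; \mathbb{E}\left[\log p_\theta(x)\right] \quad \text{subject to} \quad \|G\|_0 \le \beta . $$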
arXiv Detail & Related papers (2021-07-21T14:22:14Z) - Machine Learning S-Wave Scattering Phase Shifts Bypassing the Radial Schrödinger Equation [77.34726150561087]
We present a proof-of-concept machine learning model, resting on a convolutional neural network, capable of yielding accurate scattering s-wave phase shifts.
We discuss how the Hamiltonian can serve as a guiding principle in the construction of a physically-motivated descriptor.
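A hypothetical minimal sketch of such a model: a 1-D CNN that maps a discretized radial potential $V(r)$ to a single s-wave phase shift. The architecture and all names below are illustrative assumptions, not the authors' actual network.

```python
import torch
import torch.nn as nn

class PhaseShiftCNN(nn.Module):
    """Illustrative regressor: potential samples on a radial grid -> delta_0."""
    def __init__(self, n_points: int = 128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(8),
        )
        self.head = nn.Linear(32 * 8, 1)  # regress one phase shift

    def forward(self, v: torch.Tensor) -> torch.Tensor:
        # v: (batch, n_points) samples of the potential V(r) on a radial grid
        h = self.features(v.unsqueeze(1))
        return self.head(h.flatten(1)).squeeze(-1)

model = PhaseShiftCNN()
delta0 = model(torch.randn(4, 128))  # four random potentials -> four phase shifts
```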
arXiv Detail & Related papers (2021-06-25T17:25:38Z) - Quantum field-theoretic machine learning [0.0]
We recast the $\phi^4$ scalar field theory as a machine learning algorithm within the mathematically rigorous framework of Markov random fields.
Neural networks, which can be viewed as generalizations of conventional neural networks, are additionally derived from the $\phi^4$ theory.
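To make the correspondence concrete (standard continuum form of the action, included for reference), the $\phi^4$ theory is defined by the Euclidean action
$$ S[\phi] = \int d^d x \left[ \tfrac{1}{2} (\partial_\mu \phi)^2 + \tfrac{1}{2} m^2 \phi^2 + \tfrac{\lambda}{4!} \phi^4 \right] , $$
so that a lattice discretization with Boltzmann weight $p[\phi] \propto e^{-S[\phi]}$ carries the local Gibbs structure of a Markov random field.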
arXiv Detail & Related papers (2021-02-18T16:12:51Z)