Optimal and Diffusion Transports in Machine Learning
- URL: http://arxiv.org/abs/2512.06797v1
- Date: Sun, 07 Dec 2025 11:25:32 GMT
- Title: Optimal and Diffusion Transports in Machine Learning
- Authors: Gabriel Peyré
- Abstract summary: Problems in machine learning are naturally expressed as the design and analysis of time-evolving probability distributions. This survey presents an overview of two complementary approaches: diffusion methods and optimal transport. We illustrate how both approaches appear in applications ranging from sampling and neural network optimization to modeling the dynamics of transformers for large language models.
- Score: 21.689846521201588
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Several problems in machine learning are naturally expressed as the design and analysis of time-evolving probability distributions. This includes sampling via diffusion methods, optimizing the weights of neural networks, and analyzing the evolution of token distributions across layers of large language models. While the targeted applications differ (samples, weights, tokens), their mathematical descriptions share a common structure. A key idea is to switch from the Eulerian representation of densities to their Lagrangian counterpart through vector fields that advect particles. This dual view introduces challenges, notably the non-uniqueness of Lagrangian vector fields, but also opportunities to craft density evolutions and flows with favorable properties in terms of regularity, stability, and computational tractability. This survey presents an overview of these methods, with emphasis on two complementary approaches: diffusion methods, which rely on stochastic interpolation processes and underpin modern generative AI, and optimal transport, which defines interpolation by minimizing displacement cost. We illustrate how both approaches appear in applications ranging from sampling and neural network optimization to modeling the dynamics of transformers for large language models.
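The Eulerian/Lagrangian duality described in the abstract can be made concrete in a toy setting. The sketch below (an illustration under simplifying assumptions, not code from the survey) advects particles along the closed-form optimal transport map between two 1D Gaussians, and checks that the Eulerian marginals follow displacement interpolation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Source and target 1D Gaussians
m0, s0 = 0.0, 1.0   # N(0, 1)
m1, s1 = 4.0, 0.5   # N(4, 0.25)

# Closed-form Monge (optimal transport) map between 1D Gaussians:
#   T(x) = m1 + (s1/s0) * (x - m0)
def ot_map(x):
    return m1 + (s1 / s0) * (x - m0)

# Lagrangian view: particles move on straight lines from x to T(x)
# (McCann's displacement interpolation).
def displacement_interp(x, t):
    return (1.0 - t) * x + t * ot_map(x)

x0 = rng.normal(m0, s0, size=200_000)

for t in (0.0, 0.5, 1.0):
    xt = displacement_interp(x0, t)
    # Eulerian check: the marginal at time t is
    # N((1-t)*m0 + t*m1, ((1-t)*s0 + t*s1)^2)
    print(f"t={t:.1f}  mean={xt.mean():.3f}  std={xt.std():.3f}")
```

The same particle flow can equivalently be described by an Eulerian velocity field solving the continuity equation; the non-uniqueness mentioned in the abstract corresponds to the many other fields producing the same marginals.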
Related papers
- Diffusion models for multivariate subsurface generation and efficient probabilistic inversion [0.0]
Diffusion models offer stable training and state-of-the-art performance for deep generative modeling tasks. We introduce a likelihood approximation accounting for the noise contamination that is inherent in diffusion modeling. Our tests show significantly improved statistical robustness and enhanced sampling of the posterior probability density function.
arXiv Detail & Related papers (2025-07-21T17:10:16Z) - Generative AI Models for Learning Flow Maps of Stochastic Dynamical Systems in Bounded Domains [7.325529913721375]
Simulating stochastic differential equations (SDEs) in bounded domains requires accurate modeling of interior dynamics and boundary interactions. Existing learning methods are not applicable to SDEs in bounded domains because they cannot accurately capture the particle exit dynamics. We present a unified hybrid data-driven approach that combines a conditional diffusion model with an exit prediction neural network to capture both interior dynamics and boundary exit phenomena.
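As a minimal illustration of why boundary exits matter when simulating SDEs in bounded domains (a toy sketch, not this paper's method), the following Euler-Maruyama simulation of Brownian motion on [0, 1] records first-exit times; for a start at 1/2 the mean exit time should be close to the analytic value x(1-x) = 1/4:

```python
import numpy as np

rng = np.random.default_rng(2)

# Euler-Maruyama simulation of Brownian motion on [0, 1],
# recording when each particle first exits the domain.
def simulate_exits(n_paths=50_000, x0=0.5, dt=1e-3, t_max=1.0):
    x = np.full(n_paths, x0)
    alive = np.ones(n_paths, dtype=bool)
    exit_time = np.full(n_paths, np.inf)
    n_steps = int(t_max / dt)
    for k in range(n_steps):
        x[alive] += np.sqrt(dt) * rng.normal(size=alive.sum())
        exited = alive & ((x <= 0.0) | (x >= 1.0))
        exit_time[exited] = (k + 1) * dt
        alive &= ~exited
    return exit_time

et = simulate_exits()
finite = et[np.isfinite(et)]
print(f"fraction exited = {np.isfinite(et).mean():.3f}, "
      f"mean exit time = {finite.mean():.3f}")
```

Note that discrete-time monitoring systematically misses paths that cross the boundary and return between steps, which is one source of the exit-dynamics error that learned surrogates must also contend with.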
arXiv Detail & Related papers (2025-07-17T13:27:49Z) - Evolvable Conditional Diffusion [22.614995975820094]
Black-box, non-differentiable multi-physics models can be effectively used for guiding the generative process. We derive an evolution-guided approach from first principles through the lens of probabilistic evolution. We validate our proposed evolvable diffusion algorithm in two AI for Science scenarios.
arXiv Detail & Related papers (2025-06-16T07:11:32Z) - Flow-based generative models as iterative algorithms in probability space [20.890922389987676]
Flow-based generative models offer exact likelihood estimation, efficient sampling, and deterministic transformations. This tutorial presents an intuitive mathematical framework for flow-based generative models. We aim to equip researchers and practitioners with the necessary tools to effectively apply flow-based generative models in signal processing and machine learning.
arXiv Detail & Related papers (2025-02-19T03:09:18Z) - Dynamical Measure Transport and Neural PDE Solvers for Sampling [77.38204731939273]
We tackle the task of sampling from a probability density by transporting a tractable density function to the target.
We employ physics-informed neural networks (PINNs) to approximate the respective partial differential equations (PDEs) solutions.
PINNs allow for simulation- and discretization-free optimization and can be trained very efficiently.
arXiv Detail & Related papers (2024-07-10T17:39:50Z) - On the Trajectory Regularity of ODE-based Diffusion Sampling [79.17334230868693]
Diffusion-based generative models use differential equations to establish a smooth connection between a complex data distribution and a tractable prior distribution.
In this paper, we identify several intriguing trajectory properties in the ODE-based sampling process of diffusion models.
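One way to see such trajectory regularity concretely (a toy sketch, not this paper's analysis): for Gaussian data the score is known in closed form, so the probability-flow ODE of a variance-exploding diffusion can be integrated with plain Euler steps and checked against the target standard deviation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setting: data distribution N(0, s^2), so the noised marginal is
# p_sigma = N(0, s^2 + sigma^2) and the score is available in closed form:
#   grad_x log p_sigma(x) = -x / (s^2 + sigma^2)
s = 2.0
sigma_max = 50.0

def score(x, sigma):
    return -x / (s**2 + sigma**2)

# Probability-flow ODE of a variance-exploding SDE, parameterized by sigma:
#   dx/dsigma = -sigma * score(x, sigma)
# Integrate from sigma_max down to ~0 with plain Euler steps.
def pf_ode_sample(n, n_steps=1000):
    x = rng.normal(0.0, np.sqrt(s**2 + sigma_max**2), size=n)  # prior at sigma_max
    sigmas = np.linspace(sigma_max, 0.0, n_steps + 1)
    for i in range(n_steps):
        d_sigma = sigmas[i + 1] - sigmas[i]                    # negative step
        x = x + d_sigma * (-sigmas[i] * score(x, sigmas[i]))
    return x

samples = pf_ode_sample(100_000)
print(f"sample std = {samples.std():.3f}  (target s = {s})")
```

In this Gaussian case each ODE trajectory keeps x / sqrt(s^2 + sigma^2) constant, so the sampling paths are smooth and nearly straight for large sigma, a simple instance of the regularity the paper studies.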
arXiv Detail & Related papers (2024-05-18T15:59:41Z) - Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z) - Neural Field Dynamics Model for Granular Object Piles Manipulation [12.452569633458037]
We present a learning-based dynamics model for granular material manipulation.
Inspired by the Eulerian approach commonly used in fluid dynamics, our method adopts a fully convolutional neural network.
arXiv Detail & Related papers (2023-11-01T19:36:56Z) - Learning minimal representations of stochastic processes with variational autoencoders [52.99137594502433]
We introduce an unsupervised machine learning approach to determine the minimal set of parameters required to describe a process.
Our approach enables the autonomous discovery of unknown parameters describing processes.
arXiv Detail & Related papers (2023-07-21T14:25:06Z) - A Geometric Perspective on Diffusion Models [57.27857591493788]
We inspect the ODE-based sampling of a popular variance-exploding SDE.
We establish a theoretical relationship between the optimal ODE-based sampling and the classic mean-shift (mode-seeking) algorithm.
arXiv Detail & Related papers (2023-05-31T15:33:16Z) - Focus of Attention Improves Information Transfer in Visual Features [80.22965663534556]
This paper focuses on unsupervised learning for transferring visual information in a truly online setting.
The entropy terms are computed by a temporal process that yields their online estimation.
In order to better structure the input probability distribution, we use a human-like focus of attention model.
arXiv Detail & Related papers (2020-06-16T15:07:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.