Optimal Transport for Machine Learners
- URL: http://arxiv.org/abs/2505.06589v1
- Date: Sat, 10 May 2025 10:35:03 GMT
- Title: Optimal Transport for Machine Learners
- Authors: Gabriel Peyré
- Abstract summary: Optimal Transport is a foundational mathematical theory that connects optimization, partial differential equations, and probability. These course notes cover the fundamental mathematical aspects of OT, including the Monge and Kantorovich formulations. Applications in machine learning include topics like training neural networks via gradient flows, token dynamics in transformers, and the structure of GANs and diffusion models.
- Score: 23.03787751696068
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optimal Transport is a foundational mathematical theory that connects optimization, partial differential equations, and probability. It offers a powerful framework for comparing probability distributions and has recently become an important tool in machine learning, especially for designing and evaluating generative models. These course notes cover the fundamental mathematical aspects of OT, including the Monge and Kantorovich formulations, Brenier's theorem, the dual and dynamic formulations, the Bures metric on Gaussian distributions, and gradient flows. It also introduces numerical methods such as linear programming, semi-discrete solvers, and entropic regularization. Applications in machine learning include topics like training neural networks via gradient flows, token dynamics in transformers, and the structure of GANs and diffusion models. These notes focus primarily on mathematical content rather than deep learning techniques.
Related papers
- MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task [49.355810887265925]
We introduce MathFimer, a novel framework for mathematical reasoning step expansion. We develop a specialized model, MathFimer-7B, on our carefully curated NuminaMath-FIM dataset. We then apply these models to enhance existing mathematical reasoning datasets by inserting detailed intermediate steps into their solution chains.
arXiv Detail & Related papers (2025-02-17T11:22:24Z) - Wrapped Gaussian on the manifold of Symmetric Positive Definite Matrices [6.7523635840772505]
Circular and non-flat data distributions are prevalent across diverse domains of data science. A principled approach to accounting for the underlying geometry of such data is pivotal. This work lays the groundwork for extending classical machine learning and statistical methods to more complex and structured data.
arXiv Detail & Related papers (2025-02-03T16:46:46Z) - Deep Generalized Schrödinger Bridges: From Image Generation to Solving Mean-Field Games [29.570545100557215]
Generalized Schrödinger Bridges (GSBs) are a mathematical framework used to analyze the most likely particle evolution. This paper focuses on an algorithmic perspective, aiming to enhance practical usage.
arXiv Detail & Related papers (2024-12-28T21:31:53Z) - Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs).
Our approach develops within a recently introduced framework aimed at learning neural network-based surrogate models for Lattice Boltzmann collision operators.
Our work opens the way towards practical utilization of machine learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z) - Multi-GPU Approach for Training of Graph ML Models on large CFD Meshes [0.0]
Mesh-based numerical solvers are an important part of many design tool chains.
Machine Learning based surrogate models are fast in predicting approximate solutions but often lack accuracy.
This paper scales a state-of-the-art surrogate model from the domain of graph-based machine learning to industry-relevant mesh sizes.
arXiv Detail & Related papers (2023-07-25T15:49:25Z) - Towards Constituting Mathematical Structures for Learning to Optimize [101.80359461134087]
Learning to Optimize (L2O), a technique that uses machine learning to learn an optimization algorithm automatically from data, has gained increasing attention in recent years.
A generic L2O approach parameterizes the iterative update rule and learns the update direction as a black-box network.
While the generic approach is widely applicable, the learned model can overfit and may not generalize well to out-of-distribution test sets.
We propose a novel L2O model with a mathematics-inspired structure that is broadly applicable and generalizes well to out-of-distribution problems.
arXiv Detail & Related papers (2023-05-29T19:37:28Z) - Deep Efficient Continuous Manifold Learning for Time Series Modeling [11.876985348588477]
Symmetric positive definite matrices are studied in computer vision, signal processing, and medical image analysis.
In this paper, we propose a framework to exploit a diffeomorphism mapping between a Riemannian manifold and a Cholesky space.
For dynamic modeling of time-series data, we devise a continuous manifold learning method by systematically integrating a manifold ordinary differential equation and a gated recurrent neural network.
arXiv Detail & Related papers (2021-12-03T01:38:38Z) - Equivariant vector field network for many-body system modeling [65.22203086172019]
Equivariant Vector Field Network (EVFN) is built on a novel equivariant basis and the associated scalarization and vectorization layers.
We evaluate our method on predicting trajectories of simulated Newtonian mechanics systems with both fully and partially observed data.
arXiv Detail & Related papers (2021-10-26T14:26:25Z) - Physics Informed Convex Artificial Neural Networks (PICANNs) for Optimal Transport based Density Estimation [13.807546494746207]
We propose a Deep Learning approach to solve the continuous Optimal Mass Transport problem.
We focus on the ubiquitous density estimation and generative modeling tasks in statistics and machine learning.
arXiv Detail & Related papers (2021-04-02T18:44:11Z) - Learning with Density Matrices and Random Features [44.98964870180375]
A density matrix describes the statistical state of a quantum system.
It is a powerful formalism to represent both the quantum and classical uncertainty of quantum systems.
This paper explores how density matrices can be used as a building block for machine learning models.
arXiv Detail & Related papers (2021-02-08T17:54:59Z) - Matérn Gaussian processes on Riemannian manifolds [81.15349473870816]
We show how to generalize the widely-used Matérn class of Gaussian processes.
We also extend the generalization from the Matérn to the widely-used squared exponential process.
arXiv Detail & Related papers (2020-06-17T21:05:42Z) - Stochastic Flows and Geometric Optimization on the Orthogonal Group [52.50121190744979]
We present a new class of geometrically-driven optimization algorithms on the orthogonal group $O(d)$.
We show that our methods can be applied in various fields of machine learning including deep, convolutional and recurrent neural networks, reinforcement learning, flows and metric learning.
arXiv Detail & Related papers (2020-03-30T15:37:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.