Approximation Theory, Computing, and Deep Learning on the Wasserstein Space
- URL: http://arxiv.org/abs/2310.19548v3
- Date: Tue, 30 Apr 2024 13:31:33 GMT
- Title: Approximation Theory, Computing, and Deep Learning on the Wasserstein Space
- Authors: Massimo Fornasier, Pascal Heid, Giacomo Enrico Sodini
- Abstract summary: We study the challenge of approximating functions in infinite-dimensional spaces from finite samples.
Our particular focus centers on the Wasserstein distance function, which serves as a relevant example.
In our numerical implementation, we harness appropriately designed neural networks to serve as basis functions.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The challenge of approximating functions in infinite-dimensional spaces from finite samples is widely regarded as formidable. In this study, we delve into the challenging problem of the numerical approximation of Sobolev-smooth functions defined on probability spaces. Our particular focus centers on the Wasserstein distance function, which serves as a relevant example. In contrast to the existing body of literature focused on efficiently approximating pointwise evaluations, we chart a new course to define functional approximants by adopting three machine learning-based approaches: 1. Solving a finite number of optimal transport problems and computing the corresponding Wasserstein potentials. 2. Employing empirical risk minimization with Tikhonov regularization in Wasserstein Sobolev spaces. 3. Addressing the problem through the saddle point formulation that characterizes the weak form of the Tikhonov functional's Euler-Lagrange equation. As a theoretical contribution, we furnish explicit and quantitative bounds on generalization errors for each of these solutions. In the proofs, we leverage the theory of metric Sobolev spaces and combine it with techniques of optimal transport, variational calculus, and large deviation bounds. In our numerical implementation, we harness appropriately designed neural networks to serve as basis functions. These networks undergo training using diverse methodologies. This approach allows us to obtain approximating functions that can be rapidly evaluated after training. Consequently, at equal accuracy, our constructive solutions significantly enhance evaluation speed, surpassing that of state-of-the-art methods by several orders of magnitude.
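To make approach 2 concrete, here is a minimal self-contained sketch (illustrative, not the authors' implementation): it learns the map $\mu \mapsto W_1(\mu, \bar\mu)$ for one-dimensional empirical measures by Tikhonov-regularized least squares over a fixed feature map. Random Fourier features stand in for the paper's trained neural-network basis functions, and SciPy's one-dimensional Wasserstein distance stands in for a general optimal transport solver; all constants are arbitrary choices.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
n_measures, n_atoms, n_feat, lam = 200, 64, 128, 1e-3

mu_ref = rng.normal(0.0, 1.0, n_atoms)            # fixed reference measure
omegas = rng.normal(0.0, 1.0, n_feat)             # random feature frequencies
phases = rng.uniform(0.0, 2 * np.pi, n_feat)

def featurize(samples):
    # Mean embedding of the empirical measure under cosine features:
    # Phi(mu)_j = E_{x~mu}[cos(omega_j x + b_j)].
    return np.cos(np.outer(samples, omegas) + phases).mean(axis=0)

# Training set: random Gaussian empirical measures, labeled by their
# W_1 distance to the reference (each label costs one OT solve).
measures = [rng.normal(rng.uniform(-2, 2), rng.uniform(0.5, 2), n_atoms)
            for _ in range(n_measures)]
Phi = np.stack([featurize(m) for m in measures])  # (n_measures, n_feat)
y = np.array([wasserstein_distance(m, mu_ref) for m in measures])

# Empirical risk minimization with Tikhonov regularization (ridge regression).
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(n_feat), Phi.T @ y)

# After training, evaluating the approximant needs no OT solve at all.
test = rng.normal(1.0, 1.5, n_atoms)
print(featurize(test) @ w, wasserstein_distance(test, mu_ref))
```

As in the paper's constructive solutions, the point is the last step: once the coefficients are fitted, each evaluation is a feature map and an inner product rather than a fresh optimal transport problem.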
Related papers
- Finite Operator Learning: Bridging Neural Operators and Numerical Methods for Efficient Parametric Solution and Optimization of PDEs [0.0]
We introduce a method that combines neural operators, physics-informed machine learning, and standard numerical methods for solving PDEs.
We can parametrically solve partial differential equations in a data-free manner and provide accurate sensitivities.
Our study focuses on the steady-state heat equation within heterogeneous materials.
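As a minimal sketch of the data-free idea under strong simplifying assumptions (1-D steady-state heat equation $-(k u')' = f$, finite differences, homogeneous Dirichlet boundary conditions; hypothetical, not the paper's code), the training loss for a parametric solver can be the discrete PDE residual itself, so no precomputed solutions are needed:

```python
import numpy as np

n = 64
h = 1.0 / (n + 1)
f = np.ones(n)                                   # source term

def assemble(k_iface):
    # Standard finite-difference matrix for -(k u')' with zero Dirichlet
    # boundary conditions; k is sampled at the n + 1 cell interfaces.
    main = (k_iface[:-1] + k_iface[1:]) / h**2
    off = k_iface[1:-1] / h**2
    return np.diag(main) - np.diag(off, 1) - np.diag(off, -1)

def residual_loss(u, k_iface):
    # Data-free physics loss: no reference solution appears anywhere, so a
    # parametric surrogate u_theta(k) could be trained on samples of k alone.
    r = assemble(k_iface) @ u - f
    return float(r @ r)

x_iface = (np.arange(n + 1) + 0.5) * h
k = 1.0 + 0.5 * np.sin(2 * np.pi * x_iface)      # one heterogeneous conductivity

u_ref = np.linalg.solve(assemble(k), f)          # classical reference solve
print(residual_loss(u_ref, k), residual_loss(np.zeros(n), k))  # ~0 vs ||f||^2
```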
arXiv Detail & Related papers (2024-07-04T21:23:12Z)
- A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization [90.87444114491116]
This paper studies minimax optimization problems defined over infinite-dimensional function classes of overparametrized two-layer neural networks.
We address (i) the convergence of the gradient descent-ascent algorithm and (ii) the representation learning of the neural networks.
Results show that the feature representation induced by the neural networks is allowed to deviate from the initial one by the magnitude of $O(\alpha^{-1})$, measured in terms of the Wasserstein distance.
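For intuition, a toy sketch of gradient descent-ascent itself, run on a simple strongly convex-concave objective with a known saddle point rather than on the two-layer networks the paper analyzes:

```python
def grads(x, y):
    # f(x, y) = 0.5*x*x + x*y - 0.5*y*y has its unique saddle point at (0, 0).
    return x + y, x - y

x, y, eta = 3.0, -2.0, 0.1
for _ in range(500):
    gx, gy = grads(x, y)
    x, y = x - eta * gx, y + eta * gy   # descend in x, ascend in y
print(x, y)                             # both coordinates spiral in to 0
```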
arXiv Detail & Related papers (2024-04-18T16:46:08Z)
- Optimizing Solution-Samplers for Combinatorial Problems: The Landscape of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of Deep Matching Networks and Reinforcement Learning methods.
Our main contribution holds for a broad class of problems including Max- and Min-Cut, Max-$k$-CSP, Maximum-Weight-Bipartite-Matching, and the Traveling Salesman Problem.
As a byproduct of our analysis, we introduce a novel regularization process over vanilla gradient descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
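A hypothetical miniature of such a solution-sampler (not the paper's algorithm): a product-Bernoulli policy over cuts, trained by REINFORCE on a small random Max-Cut instance, with an entropy bonus standing in for the regularization that fights vanishing gradients by keeping the sampler away from saturated logits:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 12
W = rng.random((n, n)); W = np.triu(W, 1); W = W + W.T   # random weighted graph

def cut_value(s):                       # s in {0, 1}^n encodes one side of the cut
    return float(s @ W @ (1 - s))

theta = np.zeros(n)                     # logits of the product-Bernoulli sampler
lr, beta, batch = 0.05, 0.1, 64
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-theta))
    S = (rng.random((batch, n)) < p).astype(float)       # sample candidate cuts
    R = np.array([cut_value(s) for s in S])
    adv = R - R.mean()                                   # baseline for variance
    # REINFORCE: grad log p(s) = s - p for a product-Bernoulli policy; the
    # entropy bonus contributes -theta * p * (1 - p) per coordinate.
    g = (adv[:, None] * (S - p)).mean(axis=0) - beta * theta * p * (1 - p)
    theta += lr * g                                      # ascend value + entropy
print(cut_value((theta > 0).astype(float)))              # greedily decoded cut
```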
arXiv Detail & Related papers (2023-10-08T23:39:38Z)
- Spectral operator learning for parametric PDEs without data reliance [6.7083321695379885]
We introduce a novel operator learning-based approach for solving parametric partial differential equations (PDEs) without relying on data.
The proposed framework demonstrates superior performance compared to existing scientific machine learning techniques.
arXiv Detail & Related papers (2023-10-03T12:37:15Z)
- Learning from Integral Losses in Physics Informed Neural Networks [7.4308941970763795]
This work proposes a solution for the problem of training physics-informed networks governed by partial integro-differential equations.
We show that naively replacing these integrals with unbiased estimates leads to biased loss functions and solutions.
Our numerical results confirm the existence of the aforementioned bias in practice and also show that our proposed delayed target approach can lead to accurate solutions with comparable quality to ones estimated with a large sample size integral.
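The bias is easy to reproduce numerically. In this illustrative sketch (not the paper's code), squaring a sample-mean estimate of an integral $I$ overshoots $I^2$ by the estimator's variance, while multiplying two independent estimates, in the spirit of a delayed or detached target, stays unbiased:

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.sin                                   # integrand on [0, pi]
I = 2.0 / np.pi                              # true mean of sin over [0, pi]

N, trials = 8, 200_000
x1 = rng.uniform(0, np.pi, (trials, N))
x2 = rng.uniform(0, np.pi, (trials, N))
est1 = f(x1).mean(axis=1)                    # unbiased estimates of I
est2 = f(x2).mean(axis=1)                    # independent second estimates

naive = (est1 ** 2).mean()                   # E[est^2] = I^2 + Var(est): biased
paired = (est1 * est2).mean()                # product of independent estimates
print(I ** 2, naive, paired)                 # naive overshoots, paired does not
```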
arXiv Detail & Related papers (2023-05-27T06:46:08Z)
- Monte Carlo Neural PDE Solver for Learning PDEs via Probabilistic Representation [59.45669299295436]
We propose a Monte Carlo PDE solver for training unsupervised neural solvers.
We use the PDEs' probabilistic representation, which regards macroscopic phenomena as ensembles of random particles.
Our experiments on convection-diffusion, Allen-Cahn, and Navier-Stokes equations demonstrate significant improvements in accuracy and efficiency.
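A minimal sketch of the probabilistic representation behind such solvers, simplified here to the pure heat equation (the paper treats more general equations): $u_t = D u_{xx}$ with $u(0,\cdot) = g$ satisfies $u(t,x) = \mathbb{E}[g(x + \sqrt{2Dt}\,Z)]$ for $Z \sim \mathcal{N}(0,1)$, so point values of $u$ are averages over random particles rather than outputs of a mesh-based solve:

```python
import numpy as np

rng = np.random.default_rng(0)
D, t, x = 0.5, 0.2, 0.3
g = lambda y: np.exp(-y ** 2)                 # Gaussian initial condition

Z = rng.standard_normal(100_000)              # random particles
u_mc = g(x + np.sqrt(2 * D * t) * Z).mean()   # Monte Carlo point estimate

# Closed form for this initial datum: convolving exp(-y^2) with a heat kernel
# of variance s = 2*D*t gives exp(-x^2 / (1 + 2s)) / sqrt(1 + 2s).
s = 2 * D * t
print(u_mc, np.exp(-x ** 2 / (1 + 2 * s)) / np.sqrt(1 + 2 * s))
```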
arXiv Detail & Related papers (2023-02-10T08:05:19Z)
- Mean-field neural networks: learning mappings on Wasserstein space [0.0]
We study the machine learning task for models with operators mapping between the Wasserstein space of probability measures and a space of functions.
Two classes of neural networks are proposed to learn so-called mean-field functions.
We present different algorithms relying on mean-field neural networks for solving time-dependent mean-field problems.
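One plausible way to build a network whose input is a probability measure (an illustrative design choice, not necessarily one of the paper's two classes): represent $\mu$ by i.i.d. samples and use a permutation-invariant, DeepSets-style encoder, so the output depends on the empirical measure rather than on the ordering of the samples:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid = 1, 16
W1 = rng.normal(0, 1, (d_in, d_hid))          # untrained weights, for shape only
b1 = np.zeros(d_hid)
w2 = rng.normal(0, 1, d_hid)

def mean_field_net(samples):
    # samples: (n, d_in) draws from mu. The feature map acts per sample and
    # the mean aggregation makes the output a function of the measure itself.
    phi = np.tanh(samples @ W1 + b1)
    return float(np.mean(phi, axis=0) @ w2)

mu = rng.normal(0.0, 1.0, (500, 1))
print(mean_field_net(mu), mean_field_net(mu[::-1]))   # invariant to ordering
```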
arXiv Detail & Related papers (2022-10-27T05:11:42Z)
- Learning High Dimensional Wasserstein Geodesics [55.086626708837635]
We propose a new formulation and learning strategy for computing the Wasserstein geodesic between two probability distributions in high dimensions.
By applying the method of Lagrange multipliers to the dynamic formulation of the optimal transport (OT) problem, we derive a minimax problem whose saddle point is the Wasserstein geodesic.
We then parametrize the functions by deep neural networks and design a sample-based bidirectional learning algorithm for training.
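In standard Benamou-Brenier notation (a sketch of the construction the summary describes, not necessarily the paper's exact formulation), the dynamic optimal transport problem and its Lagrangian saddle point read:

```latex
W_2^2(\mu_0,\mu_1) = \min_{\rho,\,v}\ \int_0^1\!\!\int \|v_t(x)\|^2\,\rho_t(x)\,dx\,dt
\quad\text{s.t.}\quad \partial_t\rho_t + \nabla\cdot(\rho_t v_t) = 0,\quad
\rho_0=\mu_0,\ \rho_1=\mu_1.

% Enforcing the continuity equation with a multiplier (potential) \phi yields
% the minimax problem whose saddle point is the Wasserstein geodesic:
\min_{\rho,\,v}\ \max_{\phi}\ \int_0^1\!\!\int \|v_t\|^2\,\rho_t\,dx\,dt
+ \int_0^1\!\!\int \phi\,\big(\partial_t\rho_t + \nabla\cdot(\rho_t v_t)\big)\,dx\,dt.
```

In the paper's setting, the functions appearing here are then parametrized by deep neural networks.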
arXiv Detail & Related papers (2021-02-05T04:25:28Z)
- Optimal oracle inequalities for solving projected fixed-point equations [53.31620399640334]
We study methods that use a collection of random observations to compute approximate solutions by searching over a known low-dimensional subspace of the Hilbert space.
We show how our results precisely characterize the error of a class of temporal difference learning methods for the policy evaluation problem with linear function approximation.
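For concreteness, here is a minimal sketch of one method in this class: TD(0) policy evaluation with linear function approximation, a stochastic scheme for a projected Bellman fixed-point equation. The three-state chain, rewards, and features are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
P = np.array([[0.5, 0.5, 0.0],      # transition matrix under the fixed policy
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])
r = np.array([1.0, 0.0, -1.0])      # expected per-state rewards
gamma = 0.9
Phi = np.array([[1.0, 0.0],         # value function approximated as Phi @ w:
                [0.5, 0.5],         # a known 2-dimensional subspace of R^3
                [0.0, 1.0]])

w = np.zeros(2)
s = 0
for t in range(100_000):
    s_next = rng.choice(3, p=P[s])
    td = r[s] + gamma * Phi[s_next] @ w - Phi[s] @ w   # temporal-difference error
    w += 0.01 / (1 + t / 1000) * td * Phi[s]           # decaying step size
    s = s_next
print(Phi @ w)                      # approximate values near the projected fixed point
```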
arXiv Detail & Related papers (2020-12-09T20:19:32Z)
- A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
We propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs).
We derive a second-order Wasserstein gradient flow of the global relative entropy from the Fokker-Planck equation.
Compared with existing schemes, Wasserstein gradient flow is a smoother and near-optimal numerical scheme to approximate real data densities.
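The classical correspondence behind this (stated in standard Otto/JKO-style notation as a sketch, not the paper's exact derivation): the Fokker-Planck equation is the Wasserstein-2 gradient flow of the relative entropy with respect to the target density $\pi \propto e^{-V}$:

```latex
\partial_t \rho_t
= \nabla\cdot(\rho_t \nabla V) + \Delta \rho_t
= \nabla\cdot\!\Big(\rho_t\,\nabla \tfrac{\delta}{\delta\rho}\,
  \mathrm{KL}(\rho_t \,\|\, \pi)\Big),
\qquad \pi \propto e^{-V},
```

so discretizing this flow in Wasserstein space yields a scheme for driving a model density toward the real data density.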
arXiv Detail & Related papers (2019-10-31T02:26:20Z)