Approximation Theory, Computing, and Deep Learning on the Wasserstein Space
- URL: http://arxiv.org/abs/2310.19548v4
- Date: Thu, 10 Oct 2024 13:30:32 GMT
- Title: Approximation Theory, Computing, and Deep Learning on the Wasserstein Space
- Authors: Massimo Fornasier, Pascal Heid, Giacomo Enrico Sodini,
- Abstract summary: We address the challenge of approximating functions in infinite-dimensional spaces from finite samples.
Our focus is on the Wasserstein distance function, which serves as a relevant example.
We adopt three machine learning-based approaches to define functional approximants.
- Score: 0.5735035463793009
- License:
- Abstract: The challenge of approximating functions in infinite-dimensional spaces from finite samples is widely regarded as formidable. We delve into the challenging problem of the numerical approximation of Sobolev-smooth functions defined on probability spaces. Our particular focus centers on the Wasserstein distance function, which serves as a relevant example. In contrast to the existing body of literature focused on approximating efficiently pointwise evaluations, we chart a new course to define functional approximants by adopting three machine learning-based approaches: 1. Solving a finite number of optimal transport problems and computing the corresponding Wasserstein potentials. 2. Employing empirical risk minimization with Tikhonov regularization in Wasserstein Sobolev spaces. 3. Addressing the problem through the saddle point formulation that characterizes the weak form of the Tikhonov functional's Euler-Lagrange equation. We furnish explicit and quantitative bounds on generalization errors for each of these solutions. We leverage the theory of metric Sobolev spaces and we combine it with techniques of optimal transport, variational calculus, and large deviation bounds. In our numerical implementation, we harness appropriately designed neural networks to serve as basis functions. These networks undergo training using diverse methodologies. This approach allows us to obtain approximating functions that can be rapidly evaluated after training. Our constructive solutions significantly enhance at equal accuracy the evaluation speed, surpassing that of state-of-the-art methods by several orders of magnitude. This allows evaluations over large datasets several times faster, including training, than traditional optimal transport algorithms. Our analytically designed deep learning architecture slightly outperforms the test error of state-of-the-art CNN architectures on datasets of images.
Related papers
- Finite Operator Learning: Bridging Neural Operators and Numerical Methods for Efficient Parametric Solution and Optimization of PDEs [0.0]
We introduce a method that combines neural operators, physics-informed machine learning, and standard numerical methods for solving PDEs.
We can parametrically solve partial differential equations in a data-free manner and provide accurate sensitivities.
Our study focuses on the steady-state heat equation within heterogeneous materials.
arXiv Detail & Related papers (2024-07-04T21:23:12Z) - A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization [90.87444114491116]
This paper studies minimax optimization problems defined over infinite-dimensional function classes of overparametricized two-layer neural networks.
We address (i) the convergence of the gradient descent-ascent algorithm and (ii) the representation learning of the neural networks.
Results show that the feature representation induced by the neural networks is allowed to deviate from the initial one by the magnitude of $O(alpha-1)$, measured in terms of the Wasserstein distance.
arXiv Detail & Related papers (2024-04-18T16:46:08Z) - Optimizing Solution-Samplers for Combinatorial Problems: The Landscape
of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods.
Our main contribution holds for a broad class of problems including Max-and Min-Cut, Max-$k$-Bipartite-Bi, Maximum-Weight-Bipartite-Bi, and Traveling Salesman Problem.
As a byproduct of our analysis we introduce a novel regularization process over vanilla descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z) - Spectral operator learning for parametric PDEs without data reliance [6.7083321695379885]
We introduce a novel operator learning-based approach for solving parametric partial differential equations (PDEs) without the need for data harnessing.
The proposed framework demonstrates superior performance compared to existing scientific machine learning techniques.
arXiv Detail & Related papers (2023-10-03T12:37:15Z) - Efficient Graph Field Integrators Meet Point Clouds [59.27295475120132]
We present two new classes of algorithms for efficient field integration on graphs encoding point clouds.
The first class, SeparatorFactorization(SF), leverages the bounded genus of point cloud mesh graphs, while the second class, RFDiffusion(RFD), uses popular epsilon-nearest-neighbor graph representations for point clouds.
arXiv Detail & Related papers (2023-02-02T08:33:36Z) - Mean-field neural networks: learning mappings on Wasserstein space [0.0]
We study the machine learning task for models with operators mapping between the Wasserstein space of probability measures and a space of functions.
Two classes of neural networks are proposed to learn so-called mean-field functions.
We present different algorithms relying on mean-field neural networks for solving time-dependent mean-field problems.
arXiv Detail & Related papers (2022-10-27T05:11:42Z) - DeepPhysics: a physics aware deep learning framework for real-time
simulation [0.0]
We propose a solution to simulate hyper-elastic materials using a data-driven approach.
A neural network is trained to learn the non-linear relationship between boundary conditions and the resulting displacement field.
The results show that our network architecture trained with a limited amount of data can predict the displacement field in less than a millisecond.
arXiv Detail & Related papers (2021-09-17T12:15:47Z) - Learning High Dimensional Wasserstein Geodesics [55.086626708837635]
We propose a new formulation and learning strategy for computing the Wasserstein geodesic between two probability distributions in high dimensions.
By applying the method of Lagrange multipliers to the dynamic formulation of the optimal transport (OT) problem, we derive a minimax problem whose saddle point is the Wasserstein geodesic.
We then parametrize the functions by deep neural networks and design a sample based bidirectional learning algorithm for training.
arXiv Detail & Related papers (2021-02-05T04:25:28Z) - Optimal oracle inequalities for solving projected fixed-point equations [53.31620399640334]
We study methods that use a collection of random observations to compute approximate solutions by searching over a known low-dimensional subspace of the Hilbert space.
We show how our results precisely characterize the error of a class of temporal difference learning methods for the policy evaluation problem with linear function approximation.
arXiv Detail & Related papers (2020-12-09T20:19:32Z) - Computational characteristics of feedforward neural networks for solving
a stiff differential equation [0.0]
We study the solution of a simple but fundamental stiff ordinary differential equation modelling a damped system.
We show that it is possible to identify preferable choices to be made for parameters and methods.
Overall we extend the current literature in the field by showing what can be done in order to obtain reliable and accurate results by the neural network approach.
arXiv Detail & Related papers (2020-12-03T12:22:24Z) - Deep Magnification-Flexible Upsampling over 3D Point Clouds [103.09504572409449]
We propose a novel end-to-end learning-based framework to generate dense point clouds.
We first formulate the problem explicitly, which boils down to determining the weights and high-order approximation errors.
Then, we design a lightweight neural network to adaptively learn unified and sorted weights as well as the high-order refinements.
arXiv Detail & Related papers (2020-11-25T14:00:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.