Generative Adversarial Learning of Sinkhorn Algorithm Initializations
- URL: http://arxiv.org/abs/2212.00133v4
- Date: Fri, 2 Feb 2024 00:00:02 GMT
- Title: Generative Adversarial Learning of Sinkhorn Algorithm Initializations
- Authors: Jonathan Geuter, Vaios Laschos
- Abstract summary: We show that meticulously training a neural network to learn initializations to the algorithm via the entropic OT dual problem can significantly speed up convergence.
We show that our network can even be used as a standalone OT solver to approximate regularized transport distances to a few percent error.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Sinkhorn algorithm is the state-of-the-art to approximate solutions of
entropic optimal transport (OT) distances between discrete probability
distributions. We show that meticulously training a neural network to learn
initializations to the algorithm via the entropic OT dual problem can
significantly speed up convergence, while maintaining desirable properties of
the Sinkhorn algorithm, such as differentiability and parallelizability. We
train our predictive network in an adversarial fashion using a second,
generating network and a self-supervised bootstrapping loss. The predictive
network is universal in the sense that it is able to generalize to any pair of
distributions of fixed dimension and cost at inference, and we prove that we
can make the generating network universal in the sense that it is capable of
producing any pair of distributions during training. Furthermore, we show that
our network can even be used as a standalone OT solver to approximate
regularized transport distances to a few percent error, which makes it the
first meta neural OT solver.
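As a rough illustration of the warm-starting idea in the abstract (not the authors' implementation), the sketch below runs log-domain Sinkhorn iterations and accepts an optional initialization `f_init` of the dual potential; in the paper's setting such an initialization would be predicted by the trained network from the pair of input distributions. The function name, regularization strength `eps`, and tolerance are placeholder choices.

```python
import numpy as np
from scipy.special import logsumexp

def sinkhorn_log(a, b, C, eps=0.05, f_init=None, n_iters=1000, tol=1e-9):
    """Log-domain Sinkhorn for entropic OT between histograms a (n,) and b (m,)
    with cost matrix C (n, m). f_init is an optional warm start for the dual
    potential f; starting from a good prediction instead of zeros is the
    speed-up studied in the paper."""
    f = np.zeros_like(a) if f_init is None else np.array(f_init, dtype=float)
    log_a, log_b = np.log(a), np.log(b)
    for _ in range(n_iters):
        # Alternating dual updates, written with log-sum-exp for numerical stability.
        g = eps * (log_b - logsumexp((f[:, None] - C) / eps, axis=0))
        f_new = eps * (log_a - logsumexp((g[None, :] - C) / eps, axis=1))
        if np.max(np.abs(f_new - f)) < tol:
            f = f_new
            break
        f = f_new
    P = np.exp((f[:, None] + g[None, :] - C) / eps)  # entropic transport plan
    return f, g, float(np.sum(P * C))                # dual potentials and transport cost
```

With `f_init=None` this is the usual zero start; passing a prediction such as `f_init=net(a, b)` for a hypothetical predictor `net` gives the warm start whose convergence benefit the abstract describes, and the updates remain differentiable and parallelizable when written in an autodiff framework.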
Related papers
- LinSATNet: The Positive Linear Satisfiability Neural Networks [116.65291739666303]
This paper studies how to introduce the popular positive linear satisfiability to neural networks.
We propose the first differentiable satisfiability layer based on an extension of the classic Sinkhorn algorithm for jointly encoding multiple sets of marginal distributions (see the sketch of the classic Sinkhorn normalization after this entry).
arXiv Detail & Related papers (2024-07-18T22:05:21Z)
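The LinSATNet entry above extends the classic Sinkhorn algorithm into a differentiable layer. Purely as a minimal sketch of that classic single-marginal building block (not LinSATNet's joint encoding of multiple marginal sets), a Sinkhorn normalization layer can be written as follows; the function name, iteration count, and `eps` are illustrative.

```python
import torch

def sinkhorn_layer(scores, row_marginals, col_marginals, n_iters=20, eps=1e-8):
    """Differentiable Sinkhorn normalization of a score matrix (n, m): alternately
    rescales rows and columns so the output approximately matches the given
    marginals. Gradients flow through every iteration, which is what makes the
    projection usable as a network layer."""
    P = torch.exp(scores)  # strictly positive matrix
    for _ in range(n_iters):
        P = P * (row_marginals / (P.sum(dim=1) + eps)).unsqueeze(1)  # match row sums
        P = P * (col_marginals / (P.sum(dim=0) + eps)).unsqueeze(0)  # match column sums
    return P
```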
- Semantic Strengthening of Neuro-Symbolic Learning [85.6195120593625]
Neuro-symbolic approaches typically resort to fuzzy approximations of a probabilistic objective.
We show how to compute this efficiently for tractable circuits.
We test our approach on three tasks: predicting a minimum-cost path in Warcraft, predicting a minimum-cost perfect matching, and solving Sudoku puzzles.
arXiv Detail & Related papers (2023-02-28T00:04:22Z)
- Probabilistic Verification of ReLU Neural Networks via Characteristic Functions [11.489187712465325]
We use ideas from probability theory in the frequency domain to provide probabilistic verification guarantees for ReLU neural networks.
We interpret a (deep) feedforward neural network as a discrete dynamical system over a finite horizon.
We obtain the corresponding cumulative distribution function of the output set, which can be used to check if the network is performing as expected.
arXiv Detail & Related papers (2022-12-03T05:53:57Z)
- Robust Training and Verification of Implicit Neural Networks: A Non-Euclidean Contractive Approach [64.23331120621118]
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks.
We introduce a related embedded network and show that it can be used to provide an $\ell_\infty$-norm box over-approximation of the reachable sets of the original network (a generic interval-propagation sketch of such a box bound follows this entry).
We apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models with the models trained via existing approaches in the literature.
arXiv Detail & Related papers (2022-08-08T03:13:24Z)
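The implicit-network entry above reports an $\ell_\infty$-norm box over-approximation of reachable sets. As a generic illustration of such a box bound for a single explicit ReLU layer (plain interval arithmetic, not the paper's embedded-network or non-Euclidean contraction construction), bounds can be propagated as below; composing this over layers yields a generally looser box for a whole feedforward network.

```python
import numpy as np

def relu_layer_box(W, b, lower, upper):
    """Propagate an l_infinity box [lower, upper] through x -> relu(W @ x + b).
    Splitting W into positive and negative parts gives elementwise pre-activation
    bounds, and ReLU is monotone, so clipping at zero preserves them; the returned
    box over-approximates the layer's reachable set."""
    W_pos, W_neg = np.clip(W, 0.0, None), np.clip(W, None, 0.0)
    pre_lo = W_pos @ lower + W_neg @ upper + b
    pre_hi = W_pos @ upper + W_neg @ lower + b
    return np.maximum(pre_lo, 0.0), np.maximum(pre_hi, 0.0)
```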
- Randomly Initialized One-Layer Neural Networks Make Data Linearly Separable [1.2277343096128712]
This paper establishes that, given sufficient width, a randomly initialized one-layer neural network can transform two sets into two linearly separable sets without any training (a toy numerical check of this claim follows this entry).
arXiv Detail & Related papers (2022-05-24T01:38:43Z)
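The separability claim above is easy to probe numerically. The toy check below is not taken from the paper: it lifts a 2D dataset that is separated but not linearly separable through an untrained random ReLU layer and fits a linear least-squares classifier, whose perfect training accuracy indicates that the lifted sets are linearly separable. The data, width, and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: an inner blob and a surrounding ring (separated, but not linearly separable in 2D).
n = 200
inner = rng.normal(scale=0.5, size=(n, 2))
angles = rng.uniform(0.0, 2.0 * np.pi, size=n)
ring = np.stack([3.0 * np.cos(angles), 3.0 * np.sin(angles)], axis=1) + rng.normal(scale=0.2, size=(n, 2))
X = np.vstack([inner, ring])
y = np.hstack([-np.ones(n), np.ones(n)])

# Untrained, randomly initialized one-layer ReLU lift to a wide feature space.
width = 2000
W = rng.normal(size=(2, width))
b = rng.normal(size=width)
features = np.maximum(X @ W + b, 0.0)

# Linear least-squares classifier on the random features; perfect training accuracy
# means the two lifted sets are linearly separable.
w, *_ = np.linalg.lstsq(features, y, rcond=None)
accuracy = np.mean(np.sign(features @ w) == y)
print(f"training accuracy on random features: {accuracy:.3f}")
```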
- PAC-Bayesian Learning of Aggregated Binary Activated Neural Networks with Probabilities over Representations [2.047424180164312]
We study the expectation of a probabilistic neural network as a predictor by itself, focusing on the aggregation of binary activated neural networks with normal distributions over real-valued weights.
We show that the exact computation remains tractable for deep but narrow neural networks, thanks to a dynamic programming approach.
arXiv Detail & Related papers (2021-10-28T14:11:07Z)
- The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can solve this separation problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z)
- A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks [56.084798078072396]
We take a step towards closing the gap between theory and practice by significantly improving the known theoretical bounds on both the network width and the convergence time.
We show that convergence to a global minimum is guaranteed for networks whose width is quadratic in the sample size and linear in the depth, in time logarithmic in both.
Our analysis and convergence bounds are derived via the construction of a surrogate network with fixed activation patterns that can be transformed at any time to an equivalent ReLU network of a reasonable size.
arXiv Detail & Related papers (2021-01-12T00:40:45Z)
- Training Generative Adversarial Networks via stochastic Nash games [2.995087247817663]
Generative adversarial networks (GANs) are a class of generative models with two antagonistic neural networks: a generator and a discriminator.
We show convergence to an exact solution as the amount of available data increases.
We also show convergence of an averaged variant of the SRFB algorithm to a neighborhood of the solution when only a few samples are available.
arXiv Detail & Related papers (2020-10-17T09:07:40Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed stochastic AUC maximization with a deep neural network as the predictive model.
In theory, our method requires far fewer communication rounds.
Experiments on several datasets demonstrate the effectiveness of our method and corroborate the theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)