Universal Neural Optimal Transport
- URL: http://arxiv.org/abs/2212.00133v6
- Date: Thu, 12 Jun 2025 12:03:45 GMT
- Title: Universal Neural Optimal Transport
- Authors: Jonathan Geuter, Gregor Kornhardt, Ingimar Tomasson, Vaios Laschos
- Abstract summary: UNOT (Universal Neural Optimal Transport) is a novel framework capable of accurately predicting (entropic) OT distances and plans between discrete measures for a given cost function. We show that our network can be used as a state-of-the-art initialization for the Sinkhorn algorithm with speedups of up to $7.4\times$.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optimal Transport (OT) problems are a cornerstone of many applications, but solving them is computationally expensive. To address this problem, we propose UNOT (Universal Neural Optimal Transport), a novel framework capable of accurately predicting (entropic) OT distances and plans between discrete measures for a given cost function. UNOT builds on Fourier Neural Operators, a universal class of neural networks that map between function spaces and that are discretization-invariant, which enables our network to process measures of variable resolutions. The network is trained adversarially using a second, generating network and a self-supervised bootstrapping loss. We ground UNOT in an extensive theoretical framework. Through experiments on Euclidean and non-Euclidean domains, we show that our network not only accurately predicts OT distances and plans across a wide range of datasets, but also captures the geometry of the Wasserstein space correctly. Furthermore, we show that our network can be used as a state-of-the-art initialization for the Sinkhorn algorithm with speedups of up to $7.4\times$, significantly outperforming existing approaches.
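Since the headline result is warm-starting Sinkhorn with a network-predicted dual potential, a minimal log-domain Sinkhorn sketch may help make the mechanism concrete. Here `f_init` stands in for the network's prediction; the FNO predictor, the generating network, and the bootstrapping loss are not reproduced, and all parameter choices below are illustrative assumptions.

```python
import numpy as np
from scipy.special import logsumexp

def sinkhorn(mu, nu, C, eps=0.05, n_iters=500, f_init=None):
    """Log-domain Sinkhorn for entropic OT between histograms mu (n,) and
    nu (m,) with cost matrix C (n, m). `f_init` is an optional warm start
    for the dual potential f -- e.g. the output of a trained predictor,
    which is how UNOT accelerates the algorithm."""
    log_mu, log_nu = np.log(mu), np.log(nu)
    f = np.zeros_like(mu) if f_init is None else np.asarray(f_init, dtype=float).copy()
    for _ in range(n_iters):
        # Alternating dual updates: softmin over rows, then over columns.
        g = -eps * logsumexp((f[:, None] - C) / eps + log_mu[:, None], axis=0)
        f = -eps * logsumexp((g[None, :] - C) / eps + log_nu[None, :], axis=1)
    # Entropic transport plan; its marginals match mu and nu at convergence.
    P = np.exp((f[:, None] + g[None, :] - C) / eps + log_mu[:, None] + log_nu[None, :])
    return float((P * C).sum()), P

# Illustrative usage: two random histograms on a 1-D grid with squared cost.
rng = np.random.default_rng(0)
n = 64
mu = rng.random(n); mu /= mu.sum()
nu = rng.random(n); nu /= nu.sum()
x = np.linspace(0.0, 1.0, n)
C = (x[:, None] - x[None, :]) ** 2
cost, plan = sinkhorn(mu, nu, C)
```

A good `f_init` lets the loop terminate after far fewer iterations for the same accuracy, which is where the reported speedups come from.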
Related papers
- SCoTT: Strategic Chain-of-Thought Tasking for Wireless-Aware Robot Navigation in Digital Twins [78.53885607559958]
We propose SCoTT, a wireless-aware path planning framework. We show that SCoTT achieves path gains within 2% of DP-WA* while consistently generating shorter trajectories. We also show the practical viability of our approach by deploying SCoTT as a ROS node within Gazebo simulations.
arXiv Detail & Related papers (2024-11-27T10:45:49Z) - GradINN: Gradient Informed Neural Network [2.287415292857564]
We propose a methodology inspired by Physics-Informed Neural Networks (PINNs).
GradINNs leverage prior beliefs about a system's gradient to constrain the predicted function's gradient across all input dimensions.
We demonstrate the advantages of GradINNs, particularly in low-data regimes, on diverse problems spanning non time-dependent systems.
arXiv Detail & Related papers (2024-09-03T14:03:29Z) - LinSATNet: The Positive Linear Satisfiability Neural Networks [116.65291739666303]
This paper studies how to introduce the popular positive linear satisfiability to neural networks.
We propose the first differentiable satisfiability layer based on an extension of the classic Sinkhorn algorithm for jointly encoding multiple sets of marginal distributions.
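As a rough illustration of the Sinkhorn-style layer underlying this line of work, here is the classic matrix-scaling form of the algorithm for a single pair of marginals; LinSATNet's multi-set, positive-linear-constraint extension is not reproduced, and the function name and parameters are ours.

```python
import numpy as np

def sinkhorn_projection(logits, row_marginals, col_marginals, n_iters=50):
    """Classic Sinkhorn matrix scaling: map unconstrained scores to a
    nonnegative matrix with prescribed row and column sums. Every step
    is differentiable, which is what makes such layers usable inside
    neural networks."""
    P = np.exp(logits - logits.max())  # stabilized exponentiation
    for _ in range(n_iters):
        P *= (row_marginals / P.sum(axis=1))[:, None]  # match row sums
        P *= (col_marginals / P.sum(axis=0))[None, :]  # match column sums
    return P
```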
arXiv Detail & Related papers (2024-07-18T22:05:21Z) - LaCoOT: Layer Collapse through Optimal Transport [5.869633234882029]
We present an optimal transport method to reduce the depth of over-parametrized deep neural networks.
We show that minimizing this distance enables the complete removal of intermediate layers in the network, with almost no performance loss and without requiring any finetuning.
arXiv Detail & Related papers (2024-06-13T09:03:53Z) - Decentralized Optimization in Time-Varying Networks with Arbitrary Delays [22.40154714677385]
We consider a decentralized optimization problem for networks affected by communication delays.
Examples of such networks include collaborative machine learning, sensor networks, and multi-agent systems.
To mimic communication delays, we add virtual non-computing nodes to the network, resulting in directed graphs.
arXiv Detail & Related papers (2024-05-29T20:51:38Z) - Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method that optimizes the sparse structure of a randomly initialized network at each iteration and tweaks unimportant weights on-the-fly by a small amount proportional to their magnitude.
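A toy sketch of the soft-shrinkage idea (our reading of the summary, not the authors' exact ISS-P procedure): rather than hard-zeroing weights below a sparsity threshold, they are shrunk by a small amount proportional to their magnitude, so mistakenly pruned weights can recover in later iterations.

```python
import numpy as np

def soft_shrink_step(w, sparsity=0.5, shrink_rate=0.02):
    """One pruning step: weights below the magnitude quantile given by
    `sparsity` are softly shrunk instead of being set to zero."""
    w = w.copy()
    thresh = np.quantile(np.abs(w), sparsity)
    unimportant = np.abs(w) < thresh
    w[unimportant] -= shrink_rate * w[unimportant]  # shrink proportional to magnitude
    return w
```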
arXiv Detail & Related papers (2023-03-16T21:06:13Z) - Semantic Strengthening of Neuro-Symbolic Learning [85.6195120593625]
Neuro-symbolic approaches typically resort to fuzzy approximations of a probabilistic objective.
We show how to compute this efficiently for tractable circuits.
We test our approach on three tasks: predicting a minimum-cost path in Warcraft, predicting a minimum-cost perfect matching, and solving Sudoku puzzles.
arXiv Detail & Related papers (2023-02-28T00:04:22Z) - Learning k-Level Structured Sparse Neural Networks Using Group Envelope Regularization [4.0554893636822]
We introduce a novel approach to deploy large-scale Deep Neural Networks on constrained resources.
The method speeds up inference time and aims to reduce memory demand and power consumption.
arXiv Detail & Related papers (2022-12-25T15:40:05Z) - Probabilistic Verification of ReLU Neural Networks via Characteristic Functions [11.489187712465325]
We use ideas from probability theory in the frequency domain to provide probabilistic verification guarantees for ReLU neural networks.
We interpret a (deep) feedforward neural network as a discrete dynamical system over a finite horizon.
We obtain the corresponding cumulative distribution function of the output set, which can be used to check if the network is performing as expected.
arXiv Detail & Related papers (2022-12-03T05:53:57Z) - Entropic Neural Optimal Transport via Diffusion Processes [105.34822201378763]
We propose a novel neural algorithm for the fundamental problem of computing the entropic optimal transport (EOT) plan between continuous probability distributions.
Our algorithm is based on the saddle point reformulation of the dynamic version of EOT which is known as the Schrödinger Bridge problem.
In contrast to the prior methods for large-scale EOT, our algorithm is end-to-end and consists of a single learning step.
arXiv Detail & Related papers (2022-11-02T14:35:13Z) - Unsupervised Optimal Power Flow Using Graph Neural Networks [172.33624307594158]
We use a graph neural network to learn a nonlinear parametrization between the power demanded and the corresponding allocation.
We show through simulations that the use of GNNs in this unsupervised learning context leads to solutions comparable to standard solvers.
arXiv Detail & Related papers (2022-10-17T17:30:09Z) - Robust Training and Verification of Implicit Neural Networks: A Non-Euclidean Contractive Approach [64.23331120621118]
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks.
We introduce a related embedded network and show that the embedded network can be used to provide an $\ell_\infty$-norm box over-approximation of the reachable sets of the original network.
We apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models with the models trained via existing approaches in the literature.
arXiv Detail & Related papers (2022-08-08T03:13:24Z) - Adaptive Self-supervision Algorithms for Physics-informed Neural Networks [59.822151945132525]
Physics-informed neural networks (PINNs) incorporate physical knowledge from the problem domain as a soft constraint on the loss function.
We study the impact of the location of the collocation points on the trainability of these models.
We propose a novel adaptive collocation scheme which progressively allocates more collocation points to areas where the model is making higher errors.
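A sketch of residual-proportional resampling, which is the common form such adaptive collocation schemes take (the paper's exact scheme may differ; `residual_fn` and all parameters are placeholders):

```python
import numpy as np

def resample_collocation(residual_fn, lo, hi, n_points, n_candidates=10000, seed=0):
    """Draw candidate points uniformly over the box [lo, hi]^d, then keep a
    subset sampled with probability proportional to the PDE residual, so
    regions where the PINN errs most receive more collocation points."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    cand = rng.uniform(lo, hi, size=(n_candidates, lo.size))
    res = np.abs(residual_fn(cand)) + 1e-12       # |residual| at each candidate
    idx = rng.choice(n_candidates, size=n_points, replace=False, p=res / res.sum())
    return cand[idx]
```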
arXiv Detail & Related papers (2022-07-08T18:17:06Z) - Randomly Initialized One-Layer Neural Networks Make Data Linearly Separable [1.2277343096128712]
This paper establishes that, given sufficient width, a randomly initialized one-layer neural network can transform two sets into two linearly separable sets without any training.
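An empirical illustration of that claim (our own toy setup, not the paper's construction): two concentric circles are not linearly separable in the plane, but become so after one wide random ReLU layer.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import LinearSVC

# Two classes that are not linearly separable in the input space.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.02, random_state=0)

# One wide, untrained ReLU layer with random weights and biases.
rng = np.random.default_rng(0)
width = 2000
W = rng.normal(size=(X.shape[1], width))
b = rng.normal(size=width)
features = np.maximum(X @ W + b, 0.0)

# A linear classifier now separates the transformed sets.
clf = LinearSVC().fit(features, y)
print("train accuracy:", clf.score(features, y))  # typically 1.0
```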
arXiv Detail & Related papers (2022-05-24T01:38:43Z) - PAC-Bayesian Learning of Aggregated Binary Activated Neural Networks with Probabilities over Representations [2.047424180164312]
We study the expectation of a probabilistic neural network as a predictor by itself, focusing on the aggregation of binary activated neural networks with normal distributions over real-valued weights.
We show that the exact computation remains tractable for deep but narrow neural networks, thanks to a dynamic programming approach.
arXiv Detail & Related papers (2021-10-28T14:11:07Z) - Edge Rewiring Goes Neural: Boosting Network Resilience via Policy Gradient [62.660451283548724]
ResiNet is a reinforcement learning framework to discover resilient network topologies against various disasters and attacks.
We show that ResiNet achieves a near-optimal resilience gain on multiple graphs while balancing the utility, with a large margin compared to existing approaches.
arXiv Detail & Related papers (2021-10-18T06:14:28Z) - Deep Neural Networks and PIDE discretizations [2.4063592468412276]
We propose neural networks that tackle the problems of stability and field-of-view of a Convolutional Neural Network (CNN).
We propose integral-based spatially nonlocal operators which are related to global weighted Laplacian, fractional Laplacian and fractional inverse Laplacian operators.
We test the effectiveness of the proposed neural architectures on benchmark image classification datasets and semantic segmentation tasks in autonomous driving.
arXiv Detail & Related papers (2021-08-05T08:03:01Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can solve this problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - Train Once and Use Forever: Solving Boundary Value Problems in Unseen Domains with Pre-trained Deep Learning Models [0.20999222360659606]
This paper introduces a transferable framework for solving boundary value problems (BVPs) via deep neural networks.
First, we introduce the genomic flow network (GFNet), a neural network that can infer the solution of a BVP across arbitrary boundary conditions.
Then, we propose the mosaic flow (MF) predictor, a novel iterative algorithm that assembles or stitches the GFNet's inferences.
arXiv Detail & Related papers (2021-04-22T05:20:27Z) - A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks [56.084798078072396]
We take a step towards closing the gap between theory and practice by significantly improving the known theoretical bounds on both the network width and the convergence time.
We show that convergence to a global minimum is guaranteed for networks with quadratic widths in the sample size and linear in their depth at a time logarithmic in both.
Our analysis and convergence bounds are derived via the construction of a surrogate network with fixed activation patterns that can be transformed at any time to an equivalent ReLU network of a reasonable size.
arXiv Detail & Related papers (2021-01-12T00:40:45Z) - Data-Driven Random Access Optimization in Multi-Cell IoT Networks with NOMA [78.60275748518589]
Non-orthogonal multiple access (NOMA) is a key technology to enable massive machine type communications (mMTC) in 5G networks and beyond.
In this paper, NOMA is applied to improve the random access efficiency in high-density spatially-distributed multi-cell wireless IoT networks.
A novel formulation of random channel access management is proposed, in which the transmission probability of each IoT device is tuned to maximize the geometric mean of users' expected capacity.
arXiv Detail & Related papers (2021-01-02T15:21:08Z) - Topological obstructions in neural networks learning [67.8848058842671]
We study global properties of the loss gradient function flow.
We use topological data analysis of the loss function and its Morse complex to relate local behavior along gradient trajectories with global properties of the loss surface.
arXiv Detail & Related papers (2020-12-31T18:53:25Z) - Learning to Beamform in Heterogeneous Massive MIMO Networks [48.62625893368218]
Finding the optimal beamformers in massive multiple-input multiple-output (MIMO) networks is a well-known problem.
We propose a novel deep learning based algorithm to address this problem.
arXiv Detail & Related papers (2020-11-08T12:48:06Z) - Training Generative Adversarial Networks via stochastic Nash games [2.995087247817663]
Generative adversarial networks (GANs) are a class of generative models with two antagonistic neural networks: a generator and a discriminator.
We show convergence to an exact solution when an increasing number of data is available.
We also show convergence of an averaged variant of the SRFB algorithm to a neighborhood of the solution when only few samples are available.
arXiv Detail & Related papers (2020-10-17T09:07:40Z) - Graph Neural Networks for Scalable Radio Resource Management: Architecture Design and Theoretical Analysis [31.372548374969387]
We propose to apply graph neural networks (GNNs) to solve large-scale radio resource management problems.
The proposed method is highly scalable and can solve the beamforming problem in an interference channel with $1000$ transceiver pairs within $6$ milliseconds on a single GPU.
arXiv Detail & Related papers (2020-07-15T11:43:32Z) - Deep neural networks for inverse problems with pseudodifferential operators: an application to limited-angle tomography [0.4110409960377149]
We propose a novel convolutional neural network (CNN) designed for learning pseudodifferential operators ($\Psi$DOs) in the context of linear inverse problems.
We show that, under rather general assumptions on the forward operator, the unfolded iterations of ISTA can be interpreted as the successive layers of a CNN.
In particular, we prove that, in the case of LA-CT, the operations of upscaling, downscaling and convolution, can be exactly determined by combining the convolutional nature of the limited angle X-ray transform and basic properties defining a wavelet system.
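To make the "unfolded ISTA as network layers" observation concrete, here is generic unrolled ISTA for a sparse linear inverse problem (a sketch under our own assumptions; the paper's wavelet and limited-angle structure and any learned parameters are not reproduced):

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def unrolled_ista(A, y, n_layers=10, lam=0.1):
    """Each ISTA iteration -- a gradient step on ||Ax - y||^2 followed by
    soft thresholding -- plays the role of one network layer; learned
    variants replace A^T, the step size, or the threshold with trainable
    operations."""
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_layers):
        x = soft_threshold(x - (A.T @ (A @ x - y)) / L, lam / L)
    return x
```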
arXiv Detail & Related papers (2020-06-02T14:03:41Z) - Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
In theory, our method requires a much smaller number of communication rounds.
Our experiments on several datasets demonstrate the effectiveness of our method and confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)