Sliced Iterative Normalizing Flows
- URL: http://arxiv.org/abs/2007.00674v3
- Date: Mon, 14 Jun 2021 19:11:07 GMT
- Title: Sliced Iterative Normalizing Flows
- Authors: Biwei Dai and Uros Seljak
- Abstract summary: We develop an iterative (greedy) deep learning (DL) algorithm which is able to transform an arbitrary probability distribution function (PDF) into the target PDF.
As special cases of this algorithm, we introduce two sliced iterative Normalizing Flow (SINF) models, which map from the data to the latent space (GIS) and vice versa (SIG).
- Score: 7.6146285961466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop an iterative (greedy) deep learning (DL) algorithm which is able
to transform an arbitrary probability distribution function (PDF) into the
target PDF. The model is based on iterative Optimal Transport of a series of 1D
slices, matching on each slice the marginal PDF to the target. The axes of the
orthogonal slices are chosen to maximize the PDF difference using Wasserstein
distance at each iteration, which enables the algorithm to scale well to high
dimensions. As special cases of this algorithm, we introduce two sliced
iterative Normalizing Flow (SINF) models, which map from the data to the latent
space (GIS) and vice versa (SIG). We show that SIG is able to generate high
quality samples of image datasets, which match the GAN benchmarks, while GIS
obtains competitive results on density estimation tasks compared to NFs trained
for density estimation, and is more stable, faster, and achieves higher $p(x)$
when trained on small training sets. The SINF approach deviates significantly
from the current DL paradigm, as it is greedy and does not use concepts such as
mini-batching, stochastic gradient descent, and gradient back-propagation
through deep layers.
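The core update is simple enough to prototype. Below is a minimal NumPy sketch of one sliced-OT iteration, under two stated simplifications: random orthogonal axes stand in for the paper's Wasserstein-maximizing axis search, and each 1D marginal is matched to the target by sorted (monotone) assignment, the exact 1D OT map between equal-size samples.

```python
import numpy as np

def sliced_ot_iteration(x, y, rng):
    """One sliced-OT update pushing samples x toward samples y.

    x, y : (n, d) arrays with equal n. Axes are random orthogonal
    here; the paper instead picks axes maximizing the Wasserstein
    distance between the projected marginals.
    """
    n, d = x.shape
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))  # orthogonal axes
    xp, yp = x @ q, y @ q                             # project onto slices
    # Exact 1D OT on each slice: match by rank (monotone assignment).
    ranks = np.argsort(np.argsort(xp, axis=0), axis=0)
    xp_new = np.take_along_axis(np.sort(yp, axis=0), ranks, axis=0)
    return xp_new @ q.T                               # rotate back

rng = np.random.default_rng(0)
x = rng.standard_normal((2000, 2)) * 3.0              # source samples
y = rng.standard_normal((2000, 2)) @ np.diag([1.0, 0.3]) + np.array([5.0, -2.0])
for _ in range(50):
    x = sliced_ot_iteration(x, y, rng)                # x now ~ distribution of y
```

Repeating this update transports the source samples onto the target distribution; in SINF each such step is additionally an invertible flow layer, which is what lets GIS track densities and SIG generate samples.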
Related papers
- Generative Modeling with Flow-Guided Density Ratio Learning [12.192867460641835]
Flow-Guided Density Ratio Learning (FDRL) is a simple and scalable approach to generative modeling.
We show that FDRL can generate images of dimensions as high as $128\times128$, as well as outperform existing gradient flow baselines on quantitative benchmarks.
arXiv Detail & Related papers (2023-03-07T07:55:52Z)
- Federated Learning Using Variance Reduced Stochastic Gradient for Probabilistically Activated Agents [0.0]
This paper proposes an algorithm for Federated Learning (FL) with a two-layer structure that achieves both variance reduction and a faster convergence rate to an optimal solution in the setting where each agent has an arbitrary probability of selection in each iteration.
arXiv Detail & Related papers (2022-10-25T22:04:49Z)
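As background for the variance-reduction ingredient in the entry above, here is a generic SVRG-style update in NumPy. This is a textbook sketch, not the paper's two-layer federated algorithm; `grad_i` and all other names are illustrative assumptions.

```python
import numpy as np

def svrg_epoch(w, grad_i, n, lr, inner_steps, rng):
    """One SVRG epoch: a full-gradient snapshot plus corrected
    stochastic steps. grad_i(w, i) returns the gradient of the
    i-th sample's loss at w."""
    w_snap = w.copy()
    full_grad = np.mean([grad_i(w_snap, i) for i in range(n)], axis=0)
    for _ in range(inner_steps):
        i = rng.integers(n)
        # Unbiased direction whose variance shrinks as w nears w_snap.
        g = grad_i(w, i) - grad_i(w_snap, i) + full_grad
        w = w - lr * g
    return w
```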
- On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- Generative Modeling with Optimal Transport Maps [83.59805931374197]
Optimal Transport (OT) has become a powerful tool for large-scale generative modeling tasks.
We show that the OT map itself can be used as a generative model, providing comparable performance.
arXiv Detail & Related papers (2021-10-06T18:17:02Z)
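As a toy illustration of an OT map acting as a generative model: between two Gaussians the quadratic-cost OT map is linear and available in closed form, so target-like samples are produced by pushing source samples through it. The cited paper learns the map with a neural network; the Gaussian case below is assumed purely for illustration.

```python
import numpy as np
from scipy.linalg import sqrtm

def gaussian_ot_map(m1, s1, m2, s2):
    """Closed-form quadratic-cost OT map T(x) = m2 + A (x - m1) between
    N(m1, s1) and N(m2, s2), with
    A = s1^{-1/2} (s1^{1/2} s2 s1^{1/2})^{1/2} s1^{-1/2}."""
    r1 = np.real(sqrtm(s1))
    r1_inv = np.linalg.inv(r1)
    a = r1_inv @ np.real(sqrtm(r1 @ s2 @ r1)) @ r1_inv
    return lambda x: m2 + (x - m1) @ a.T  # A is symmetric, a.T == a

rng = np.random.default_rng(0)
m1, s1 = np.zeros(2), np.eye(2)
m2, s2 = np.array([3.0, -1.0]), np.array([[2.0, 0.6], [0.6, 0.5]])
T = gaussian_ot_map(m1, s1, m2, s2)
samples = T(rng.multivariate_normal(m1, s1, size=1000))  # ~ N(m2, s2)
```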
- Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via GDPA Linearization [59.87663954467815]
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.
In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for the semi-definite programming relaxation (SDR) of a binary graph classifier.
Experimental results show that our unrolled network outperformed pure model-based graph classifiers, and achieved performance comparable to pure data-driven networks while using far fewer parameters.
arXiv Detail & Related papers (2021-09-10T07:01:15Z)
- Scalable Optimal Transport in High Dimensions for Graph Distances, Embedding Alignment, and More [7.484063729015126]
We propose two effective log-linear time approximations of the cost matrix for optimal transport.
These approximations enable general log-linear time algorithms for entropy-regularized OT that perform well even in complex, high-dimensional spaces.
For graph distance regression we propose the graph transport network (GTN), which combines graph neural networks (GNNs) with enhanced Sinkhorn.
arXiv Detail & Related papers (2021-07-14T17:40:08Z)
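For reference, below is the standard Sinkhorn iteration for entropy-regularized OT that such cost-matrix approximations are meant to accelerate. The log-linear cost approximations are the paper's contribution and are not reproduced here; note the dense O(nm) kernel this sketch still builds.

```python
import numpy as np

def sinkhorn(cost, a, b, eps=0.1, iters=200):
    """Entropy-regularized OT via Sinkhorn scaling.

    cost : (n, m) cost matrix; a, b : marginals summing to 1.
    Returns the regularized transport plan diag(u) K diag(v).
    """
    K = np.exp(-cost / eps)               # Gibbs kernel, O(n*m) memory
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)                 # match column marginals
        u = a / (K @ v)                   # match row marginals
    return u[:, None] * K * v[None, :]
```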
- Cherry-Picking Gradients: Learning Low-Rank Embeddings of Visual Data via Differentiable Cross-Approximation [53.95297550117153]
We propose an end-to-end trainable framework that processes large-scale visual data tensors by looking at only a fraction of their entries.
The proposed approach is particularly useful for large-scale multidimensional grid data, and for tasks that require context over a large receptive field.
arXiv Detail & Related papers (2021-05-29T08:39:57Z)
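Cross-approximation reconstructs a low-rank matrix from a small set of its rows and columns, which is the sense in which only a fraction of the entries is ever read. A minimal, randomly pivoted, non-differentiable skeleton (CUR) sketch of the idea follows; the paper's trainable, tensor-valued version is not reproduced.

```python
import numpy as np

def cross_approx(get_row, get_col, rows, cols):
    """Skeleton (CUR) approximation A ~ C @ pinv(W) @ R, where C, R are
    sampled columns/rows and W is their intersection block. Only the
    selected rows and columns of A are ever evaluated."""
    C = np.stack([get_col(j) for j in cols], axis=1)  # (n, k)
    R = np.stack([get_row(i) for i in rows], axis=0)  # (k, m)
    W = C[rows, :]                                    # (k, k)
    return C @ np.linalg.pinv(W) @ R

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 300))  # rank 5
rows = rng.choice(200, size=8, replace=False)
cols = rng.choice(300, size=8, replace=False)
A_hat = cross_approx(lambda i: A[i], lambda j: A[:, j], rows, cols)
```

Practical schemes choose the pivot rows and columns adaptively rather than at random; making that selection differentiable is what allows end-to-end training.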
- Why Approximate Matrix Square Root Outperforms Accurate SVD in Global Covariance Pooling? [59.820507600960745]
We propose a new GCP meta-layer that uses SVD in the forward pass and Padé approximants in the backward propagation to compute the gradients.
The proposed meta-layer has been integrated into different CNN models and achieves state-of-the-art performances on both large-scale and fine-grained datasets.
arXiv Detail & Related papers (2021-05-06T08:03:45Z)
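The forward computation at issue is a matrix square root of a global covariance; a plain SVD-based NumPy version is sketched below. The paper's actual contribution, the Padé-approximant backward pass, is omitted, and the function name and shapes are assumptions.

```python
import numpy as np

def covariance_sqrt(features):
    """Global covariance pooling followed by a matrix square root.

    features : (n, c) array of n spatial positions by c channels.
    Forward pass via SVD; the Pade-approximant gradient of the
    cited paper would replace the SVD's backward pass.
    """
    x = features - features.mean(axis=0, keepdims=True)
    cov = x.T @ x / (x.shape[0] - 1)     # (c, c) sample covariance
    u, s, _ = np.linalg.svd(cov)         # cov is symmetric PSD
    return (u * np.sqrt(s)) @ u.T        # U diag(sqrt(s)) U^T
```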
- Message Passing Descent for Efficient Machine Learning [4.416484585765027]
We propose a new iterative optimization method for the Data-Fitting (DF) problem in Machine Learning.
The approach relies on a Graphical Model representation of the DF problem.
We suggest the Message Passing Descent algorithm, which relies on a piecewise-polynomial representation of the model DF function.
arXiv Detail & Related papers (2021-02-16T12:22:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.