Representational aspects of depth and conditioning in normalizing flows
- URL: http://arxiv.org/abs/2010.01155v2
- Date: Fri, 25 Jun 2021 23:48:35 GMT
- Title: Representational aspects of depth and conditioning in normalizing flows
- Authors: Frederic Koehler, Viraj Mehta, Andrej Risteski
- Abstract summary: We show that representationally the choice of partition is not a bottleneck for depth.
We also show that shallow affine coupling networks are universal approximators in Wasserstein distance.
- Score: 33.4333537858003
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Normalizing flows are among the most popular paradigms in generative
modeling, especially for images, primarily because we can efficiently evaluate
the likelihood of a data point. This is desirable both for evaluating the fit
of a model, and for ease of training, as maximizing the likelihood can be done
by gradient descent. However, training normalizing flows comes with
difficulties as well: models which produce good samples typically need to be
extremely deep -- which comes with accompanying vanishing/exploding gradient
problems. A very related problem is that they are often poorly conditioned:
since they are parametrized as invertible maps from $\mathbb{R}^d \to
\mathbb{R}^d$, and typical training data, such as images, is intuitively
lower-dimensional, the learned maps often have Jacobians that are close to
singular.
In our paper, we tackle representational aspects around depth and
conditioning of normalizing flows: both for general invertible architectures,
and for a particular common architecture, affine couplings. We prove that
$\Theta(1)$ affine coupling layers suffice to exactly represent a permutation
or $1 \times 1$ convolution, as used in GLOW, showing that representationally
the choice of partition is not a bottleneck for depth. We also show that
shallow affine coupling networks are universal approximators in Wasserstein
distance if ill-conditioning is allowed, and experimentally investigate related
phenomena involving padding. Finally, we show a depth lower bound for general
flow architectures with few neurons per layer and bounded Lipschitz constant.
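For readers unfamiliar with the architecture discussed above, the following is a minimal NumPy sketch of an affine coupling layer with the common half/half partition of the coordinates, together with a toy composition of four such layers that swaps two coordinates. The specific scale/shift functions and the swap construction are illustrative assumptions, not the paper's construction; they only show why the triangular Jacobian makes the log-likelihood term cheap and why a constant number of coupling layers can already represent a simple permutation.

```python
# Minimal sketch of an affine coupling layer (illustrative; not the paper's code).
import numpy as np

def affine_coupling(x, s, t, update_second_half=True):
    """y = (x1, x2 * s(x1) + t(x1)) for a fixed half/half partition of the coordinates.

    One half passes through unchanged, the other half is transformed elementwise,
    so the Jacobian is triangular and log|det J| = sum(log|s(x1)|). Requiring
    s(x1) != 0 keeps the layer invertible: x2 = (y2 - t(x1)) / s(x1).
    """
    d = x.shape[-1] // 2
    if update_second_half:
        x1, x2 = x[..., :d], x[..., d:]
        y2 = x2 * s(x1) + t(x1)
        y = np.concatenate([x1, y2], axis=-1)
    else:  # condition on the second half and update the first
        x1, x2 = x[..., d:], x[..., :d]
        y2 = x2 * s(x1) + t(x1)
        y = np.concatenate([y2, x1], axis=-1)
    return y, np.sum(np.log(np.abs(s(x1))), axis=-1)  # output and log|det Jacobian|


# Toy illustration of the Theta(1)-layers idea in d = 2: three additive "shear"
# couplings plus one sign-flipping coupling exactly swap the two coordinates.
one, neg_one = (lambda u: np.ones_like(u)), (lambda u: -np.ones_like(u))
ident, neg, zero = (lambda u: u), (lambda u: -u), (lambda u: np.zeros_like(u))

x = np.array([[3.0, 5.0]])
y, _ = affine_coupling(x, s=one, t=ident, update_second_half=True)      # (a, b) -> (a, a + b)
y, _ = affine_coupling(y, s=one, t=neg, update_second_half=False)       # -> (-b, a + b)
y, _ = affine_coupling(y, s=one, t=ident, update_second_half=True)      # -> (-b, a)
y, _ = affine_coupling(y, s=neg_one, t=zero, update_second_half=False)  # -> (b, a)
print(y)  # [[5. 3.]] -- the coordinates of (3, 5) are swapped
```

Note that the last layer uses a negative scale; with a strictly positive scale parametrization such as the exponential used in RealNVP, the sign flip would have to be realized differently.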
Related papers
- Scale Propagation Network for Generalizable Depth Completion [16.733495588009184]
We propose a novel scale propagation normalization (SP-Norm) method to propagate scales from input to output.
We also develop a new network architecture based on SP-Norm and the ConvNeXt V2 backbone.
Our model consistently achieves the best accuracy with faster speed and lower memory when compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-10-24T03:53:06Z)
- NeuralGF: Unsupervised Point Normal Estimation by Learning Neural Gradient Function [55.86697795177619]
Normal estimation for 3D point clouds is a fundamental task in 3D geometry processing.
We introduce a new paradigm for learning neural gradient functions, which encourages the neural network to fit the input point clouds.
Our excellent results on widely used benchmarks demonstrate that our method can learn more accurate normals for both unoriented and oriented normal estimation tasks.
arXiv Detail & Related papers (2023-11-01T09:25:29Z)
- Neural Gradient Learning and Optimization for Oriented Point Normal Estimation [53.611206368815125]
We propose a deep learning approach to learn gradient vectors with consistent orientation from 3D point clouds for normal estimation.
We learn an angular distance field based on local plane geometry to refine the coarse gradient vectors.
Our method efficiently conducts global gradient approximation while achieving better accuracy and generalization ability for local feature description.
arXiv Detail & Related papers (2023-09-17T08:35:11Z)
- Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks [75.33431791218302]
We study the training problem of deep neural networks and introduce an analytic approach to unveil hidden convexity in the optimization landscape.
We consider a deep parallel ReLU network architecture, which also includes standard deep networks and ResNets as its special cases.
arXiv Detail & Related papers (2021-10-18T18:00:36Z)
- Universal Approximation for Log-concave Distributions using Well-conditioned Normalizing Flows [20.022920482589324]
We show that any log-concave distribution can be approximated using well-conditioned affine-coupling flows.
Our results also inform the practice of training affine couplings.
arXiv Detail & Related papers (2021-07-07T00:13:50Z)
- DiGS: Divergence guided shape implicit neural representation for unoriented point clouds [36.60407995156801]
Shape implicit neural representations (INRs) have recently shown to be effective in shape analysis and reconstruction tasks.
We propose a divergence guided shape representation learning approach that does not require normal vectors as input.
arXiv Detail & Related papers (2021-06-21T02:10:03Z)
- Learning Optical Flow from a Few Matches [67.83633948984954]
We show that the dense correlation volume representation is redundant and accurate flow estimation can be achieved with only a fraction of elements in it.
Experiments show that our method can reduce computational cost and memory use significantly, while maintaining high accuracy.
arXiv Detail & Related papers (2021-04-05T21:44:00Z)
- Self Normalizing Flows [65.73510214694987]
We propose a flexible framework for training normalizing flows by replacing expensive terms in the gradient by learned approximate inverses at each layer.
This reduces the computational complexity of each layer's exact update from $\mathcal{O}(D^3)$ to $\mathcal{O}(D^2)$ (a rough sketch of this idea appears after this list).
We show experimentally that such models are remarkably stable and optimize to similar data likelihood values as their exact gradient counterparts.
arXiv Detail & Related papers (2020-11-14T09:51:51Z)
- End-to-end Interpretable Learning of Non-blind Image Deblurring [102.75982704671029]
Non-blind image deblurring is typically formulated as a linear least-squares problem regularized by natural priors on the corresponding sharp picture's gradients.
We propose to precondition the Richardson solver using approximate inverse filters of the (known) blur and natural image prior kernels.
arXiv Detail & Related papers (2020-07-03T15:45:01Z)
- Neural Ordinary Differential Equations on Manifolds [0.342658286826597]
Normalizing flows in Euclidean space based on Neural ODEs have recently shown great promise, yet they suffer from the same limitations.
We show how vector fields provide a general framework for parameterizing a flexible class of invertible mapping on these spaces.
arXiv Detail & Related papers (2020-06-11T17:56:34Z)
- You say Normalizing Flows I see Bayesian Networks [11.23030807455021]
We show that normalizing flows reduce to Bayesian networks with a pre-defined topology and a learnable density at each node.
We show that stacking multiple transformations in a normalizing flow relaxes independence assumptions and entangles the model distribution.
We prove the non-universality of the affine normalizing flow, regardless of its depth.
arXiv Detail & Related papers (2020-06-01T11:54:50Z)
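As referenced in the Self Normalizing Flows entry above, here is a rough sketch of the approximate-inverse idea as described in that summary (not the authors' implementation). For a linear flow layer $z = Wx$ under a standard-normal base, the expensive term in the exact gradient is $\partial \log|\det W| / \partial W = (W^{-1})^\top$, which costs $\mathcal{O}(D^3)$; keeping a learned matrix $R$ trained to approximate $W^{-1}$ and substituting $R^\top$ brings each update down to $\mathcal{O}(D^2)$. The dimensions, learning rate, and toy training loop below are arbitrary assumptions.

```python
# Rough sketch of the approximate-inverse trick described in the
# "Self Normalizing Flows" summary above; hyperparameters are assumptions.
import numpy as np

rng = np.random.default_rng(0)
D = 8
W = np.eye(D) + 0.1 * rng.standard_normal((D, D))  # weight of a linear flow layer z = W x
R = np.eye(D)                                       # learned approximation to W^{-1}
lr = 1e-3

for step in range(5000):
    x = rng.standard_normal(D)
    z = W @ x

    # Exact gradient of log p(x) = -0.5*||W x||^2 + log|det W| + const w.r.t. W:
    #   -outer(z, x) + inv(W).T      <- the matrix inverse makes this O(D^3)
    # Approximation: replace inv(W).T with R.T, which costs only O(D^2).
    W += lr * (-np.outer(z, x) + R.T)

    # Keep R close to W^{-1} by descending the reconstruction penalty ||R z - x||^2.
    R -= lr * 2.0 * np.outer(R @ z - x, z)

print("||R W - I|| =", np.linalg.norm(R @ W - np.eye(D)))  # how far R is from inverting W
```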
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.