Bounded KRnet and its applications to density estimation and approximation
- URL: http://arxiv.org/abs/2305.09063v3
- Date: Wed, 23 Oct 2024 15:28:59 GMT
- Title: Bounded KRnet and its applications to density estimation and approximation
- Authors: Li Zeng, Xiaoliang Wan, Tao Zhou
- Abstract summary: In this paper, we develop an invertible mapping, called B-KRnet, on a bounded domain.
We apply it to density estimation/approximation for data or the solutions of PDEs such as the Fokker-Planck equation and the Keller-Segel equation.
- Score: 7.834363165328673
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we develop an invertible mapping, called B-KRnet, on a bounded domain and apply it to density estimation/approximation for data or the solutions of PDEs such as the Fokker-Planck equation and the Keller-Segel equation. Similar to KRnet, the structure of B-KRnet adapts the pseudo-triangular structure into a normalizing flow model. The main difference between B-KRnet and KRnet is that B-KRnet is defined on a hypercube while KRnet is defined on the whole space, in other words, a new mechanism is introduced in B-KRnet to maintain the exact invertibility. Using B-KRnet as a transport map, we obtain an explicit probability density function (PDF) model that corresponds to the pushforward of a prior (uniform) distribution on the hypercube. It can be directly applied to density estimation when only data are available. By coupling KRnet and B-KRnet, we define a deep generative model on a high-dimensional domain where some dimensions are bounded and other dimensions are unbounded. A typical case is the solution of the stationary kinetic Fokker-Planck equation, which is a PDF of position and momentum. Based on B-KRnet, we develop an adaptive learning approach to approximate partial differential equations whose solutions are PDFs or can be treated as PDFs. A variety of numerical experiments is presented to demonstrate the effectiveness of B-KRnet.
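To make the abstract's construction concrete, below is a minimal sketch of the core mechanism: an invertible coupling map on the unit hypercube whose pushforward of the uniform prior gives an explicit PDF through the change-of-variables formula. Everything here (`BoundedCoupling`, `BoundedFlow`, the logit-affine-sigmoid transform) is an illustrative assumption, not the authors' implementation; B-KRnet itself organizes its couplings in a pseudo-triangular (Knothe-Rosenblatt) structure.

```python
# Minimal sketch (assumed names, not the authors' code): density estimation on
# the unit hypercube with an invertible coupling map and a uniform prior.
import torch
import torch.nn as nn

class BoundedCoupling(nn.Module):
    """Bijection on (0,1)^d: y2 = sigmoid(s(x1) * logit(x2) + t(x1)), y1 = x1."""
    def __init__(self, d, flip=False, hidden=64):
        super().__init__()
        assert d % 2 == 0
        self.k, self.flip = d // 2, flip
        self.net = nn.Sequential(nn.Linear(self.k, hidden), nn.Tanh(),
                                 nn.Linear(hidden, d))  # raw (s, t) for the other half

    def forward(self, x):
        eps = 1e-6
        x = x.clamp(eps, 1.0 - eps)
        x1, x2 = x.chunk(2, dim=1)
        if self.flip:
            x1, x2 = x2, x1
        raw_s, t = self.net(x1).chunk(2, dim=1)
        s = nn.functional.softplus(raw_s) + eps       # s > 0 keeps each 1-D map monotone
        y2 = torch.sigmoid(s * torch.logit(x2) + t).clamp(eps, 1.0 - eps)
        # d(y2)/d(x2) = s * y2 * (1 - y2) / (x2 * (1 - x2)), hence:
        ldj = (torch.log(s) + torch.log(y2) + torch.log1p(-y2)
               - torch.log(x2) - torch.log1p(-x2)).sum(dim=1)
        y = torch.cat((y2, x1), dim=1) if self.flip else torch.cat((x1, y2), dim=1)
        return y, ldj

class BoundedFlow(nn.Module):
    def __init__(self, d, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList(BoundedCoupling(d, flip=(i % 2 == 1))
                                    for i in range(n_layers))

    def log_prob(self, x):
        # The uniform prior on (0,1)^d has density 1, so log p(x) is the total log-det.
        total = torch.zeros(x.shape[0])
        for layer in self.layers:
            x, ldj = layer(x)
            total = total + ldj
        return total

# Usage: fit samples supported on (0,1)^2 by maximum likelihood.
flow = BoundedFlow(d=2)
opt = torch.optim.Adam(flow.parameters(), lr=1e-3)
data = torch.distributions.Beta(2.0, 5.0).sample((2048, 2))
for _ in range(200):
    loss = -flow.log_prob(data).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```

Because the uniform prior has density one on the hypercube, the model's log-density reduces to the accumulated log-Jacobian, which is why exact invertibility on the bounded domain is the key requirement.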
Related papers
- Proper Latent Decomposition [4.266376725904727]
We compute a reduced set of intrinsic coordinates (latent space) to accurately describe a flow with fewer degrees of freedom than the numerical discretization.
Within this numerical framework, we propose an algorithm to perform PLD on the manifold.
This work opens opportunities for analyzing autoencoders and latent spaces, nonlinear reduced-order modeling and scientific insights into the structure of high-dimensional data.
arXiv Detail & Related papers (2024-12-01T12:19:08Z)
- Latent Schrodinger Bridge: Prompting Latent Diffusion for Fast Unpaired Image-to-Image Translation [58.19676004192321]
Diffusion models (DMs), which enable both image generation from noise and inversion from data, have inspired powerful unpaired image-to-image (I2I) translation algorithms.
We tackle this problem with Schrodinger Bridges (SBs), which are stochastic differential equations (SDEs) between distributions with minimal transport cost (the standard SB system is written out after this entry).
Inspired by this observation, we propose Latent Schrodinger Bridges (LSBs) that approximate the SB ODE via pre-trained Stable Diffusion.
We demonstrate that our algorithm successfully conducts competitive I2I translation in an unsupervised setting at only a fraction of the cost required by previous DM-based methods.
arXiv Detail & Related papers (2024-11-22T11:24:14Z)
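For context, this is the standard Schrodinger bridge system from the diffusion literature (hedged: the paper's notation and parameterization may differ). The bridge is the diffusion closest in KL divergence to a reference process, subject to prescribed endpoint marginals:

```latex
% Standard SB formulation (literature notation; not necessarily the paper's exact form).
% Entropy-regularized transport between marginals p_0 and p_1, reference process P:
\min_{\mathbb{Q}} \; D_{\mathrm{KL}}(\mathbb{Q} \,\|\, \mathbb{P})
\quad \text{s.t.} \quad \mathbb{Q}_{t=0} = p_0, \;\; \mathbb{Q}_{t=1} = p_1.
% The optimal process solves an SDE driven by a potential \Psi, paired with a
% backward potential \widehat{\Psi} such that their product recovers the marginal:
\mathrm{d}X_t = \bigl[ f(X_t, t) + g(t)^2 \, \nabla_x \log \Psi(X_t, t) \bigr]\,\mathrm{d}t
  + g(t)\,\mathrm{d}W_t,
\qquad \Psi(x,t)\,\widehat{\Psi}(x,t) = p_t(x).
```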
- Adaptive deep density approximation for stochastic dynamical systems [0.5120567378386615]
A new temporal KRnet (tKRnet) is proposed to approximate the time-dependent probability density functions (PDFs) of the state variables.
To efficiently train the tKRnet, an adaptive procedure is developed to generate collocation points for the corresponding residual loss function (a generic resampling sketch follows this entry).
A temporal decomposition technique is also employed to improve the long-time integration.
arXiv Detail & Related papers (2024-05-05T04:29:22Z)
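The adaptive-collocation idea in the entry above can be sketched generically. The function below is hypothetical and only illustrates residual-driven resampling, not the paper's actual tKRnet procedure.

```python
# A generic sketch of residual-driven adaptive collocation (illustrative names;
# not the paper's exact algorithm): concentrate collocation points where the
# PDE residual is currently large.
import numpy as np

def adapt_collocation(residual_fn, lo, hi, n_keep=500, n_candidates=5000, seed=0):
    """Draw uniform candidates, then resample with probability ~ squared residual."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    cand = rng.uniform(lo, hi, size=(n_candidates, lo.size))
    r2 = residual_fn(cand) ** 2                 # squared residual at each candidate
    p = r2 / r2.sum()                           # resampling weights
    idx = rng.choice(n_candidates, size=n_keep, replace=False, p=p)
    return cand[idx]

# Toy usage: a made-up residual that peaks near the origin of [-3, 3]^2.
pts = adapt_collocation(lambda x: np.exp(-np.sum(x**2, axis=1)), [-3, -3], [3, 3])
```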
- Machine learning in and out of equilibrium [58.88325379746631]
Our study uses a Fokker-Planck approach, adapted from statistical physics, to explore parallels between machine-learning dynamics and physical systems in and out of equilibrium.
We focus in particular on the stationary state of the system in the long-time limit, which in conventional SGD is out of equilibrium.
We propose a new variation of stochastic gradient Langevin dynamics (SGLD) that harnesses without-replacement minibatching (a minimal SGLD sketch follows this entry).
arXiv Detail & Related papers (2023-06-06T09:12:49Z)
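For reference, here is a minimal sketch of the textbook SGLD update with without-replacement (epoch-shuffled) minibatches, as mentioned in the entry above; the paper's variant may differ in details such as step-size schedules and noise scaling.

```python
# A minimal sketch of stochastic gradient Langevin dynamics (SGLD) with
# without-replacement minibatching (one shuffle per epoch). Textbook update
# rule only; not the paper's specific variant.
import numpy as np

def sgld(grad_fn, theta, data, lr=1e-3, epochs=20, batch=32, seed=0):
    """Update: theta <- theta - lr * grad_estimate + sqrt(2 * lr) * N(0, I)."""
    rng = np.random.default_rng(seed)
    n = len(data)
    for _ in range(epochs):
        order = rng.permutation(n)      # without replacement: each point used once per epoch
        for start in range(0, n, batch):
            mb = data[order[start:start + batch]]
            g = n * grad_fn(theta, mb)  # unbiased estimate of the full-data gradient
            theta = theta - lr * g + np.sqrt(2.0 * lr) * rng.standard_normal(theta.shape)
    return theta

# Toy usage: sample the posterior of a Gaussian mean under a flat prior.
data = np.random.default_rng(1).normal(2.0, 1.0, size=(256, 1))
per_point_grad = lambda th, mb: (th - mb).mean(axis=0)  # mean per-point grad of -log p
print(sgld(per_point_grad, np.zeros(1), data))          # hovers near mean(data) ~ 2
```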
- On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z)
- Bayesian Structure Learning with Generative Flow Networks [85.84396514570373]
In Bayesian structure learning, we are interested in inferring a distribution over the directed acyclic graph (DAG) from data.
Recently, a class of probabilistic models, called Generative Flow Networks (GFlowNets), have been introduced as a general framework for generative modeling.
We show that our approach, called DAG-GFlowNet, provides an accurate approximation of the posterior over DAGs.
arXiv Detail & Related papers (2022-02-28T15:53:10Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- Augmented KRnet for density estimation and approximation [0.0]
We have proposed augmented KRnets including both discrete and continuous models.
Exact invertibility has been achieved in real NVP by using a specific pattern to exchange information between two separate groups of dimensions.
KRnet has been developed to enhance the information exchange among data dimensions by incorporating the Knothe-Rosenblatt rearrangement into the structure of the transport map (a minimal triangular-map sketch follows this entry).
arXiv Detail & Related papers (2021-05-26T22:20:16Z)
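The Knothe-Rosenblatt idea mentioned above is easy to state concretely: a triangular map whose $i$-th component depends only on the first $i$ coordinates, so the Jacobian is triangular and the log-determinant is a sum of one-dimensional terms. The sketch below uses fixed, hand-picked monotone maps; KRnet parameterizes such components with neural networks.

```python
# A minimal sketch of a Knothe-Rosenblatt (triangular) map on R^2 with an
# explicit log-determinant. Illustrative, untrained maps only.
import numpy as np

def kr_map(x):
    """Triangular map: component i depends only on coordinates 1..i."""
    y1 = np.tanh(x[:, 0])                      # depends on x1 only
    shift = 0.5 * x[:, 0]                      # conditioning of dim 2 on dim 1
    y2 = np.tanh(x[:, 1] + shift)              # monotone in x2 for fixed x1
    # Triangular Jacobian: det = (dy1/dx1) * (dy2/dx2), and d tanh(u)/du = 1 - tanh(u)^2.
    ldj = np.log1p(-y1**2) + np.log1p(-y2**2)
    return np.stack([y1, y2], axis=1), ldj

# Change of variables: for y = T(x), log p_X(x) = log p_Y(T(x)) + ldj.
x = np.random.default_rng(0).normal(size=(4, 2))
y, ldj = kr_map(x)
print(y, ldj)
```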
- Adaptive deep density approximation for Fokker-Planck equations [0.0]
We present a novel deep density approximation strategy based on KRnet (ADDA-KR) for solving the steady-state Fokker-Planck equation.
We show that KRnet can efficiently estimate general high-dimensional density functions.
arXiv Detail & Related papers (2021-03-20T13:49:52Z)
- A Kernel Framework to Quantify a Model's Local Predictive Uncertainty under Data Distributional Shifts [21.591460685054546]
Internal layer outputs of a trained neural network contain all of the information related to both its mapping function and its input data distribution.
We propose a framework for predictive uncertainty quantification of a trained neural network that explicitly estimates the PDF of its raw prediction space.
The kernel framework is observed to provide model uncertainty estimates with much greater precision, based on its ability to detect model prediction errors (a toy kernel-density sketch follows this entry).
arXiv Detail & Related papers (2021-03-02T00:31:53Z)
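A toy version of the kernel idea in the entry above: fit a density estimate over a model's prediction space on in-distribution data, then score new predictions by their density. This is a generic Gaussian-KDE illustration, not the paper's exact framework.

```python
# Generic sketch: estimate the PDF of a model's prediction space with a Gaussian
# KDE; low density at a test prediction suggests high uncertainty / distribution shift.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
train_preds = rng.normal(0.0, 1.0, size=1000)   # stand-in for in-distribution outputs
kde = gaussian_kde(train_preds)

test_preds = np.array([0.1, 2.5, 6.0])          # 6.0 is far out-of-distribution
density = kde(test_preds)
uncertainty = -np.log(density + 1e-12)          # low density -> high uncertainty score
print(uncertainty)
```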
- VAE-KRnet and its applications to variational Bayes [4.9545850065593875]
We have proposed a generative model, called VAE-KRnet, for density estimation or approximation.
The VAE is used as a dimension-reduction technique to capture the latent space, and KRnet is used to model the distribution of the latent variable.
VAE-KRnet can be used as a density model to approximate either data distribution or an arbitrary probability density function.
arXiv Detail & Related papers (2020-06-29T23:14:36Z)
- Revealing the Structure of Deep Neural Networks via Convex Duality [70.15611146583068]
We study regularized deep neural networks (DNNs) and introduce a convex analytic framework to characterize the structure of hidden layers.
We show that a set of optimal hidden layer weights for a norm regularized training problem can be explicitly found as the extreme points of a convex set.
We apply the same characterization to deep ReLU networks with whitened data and prove the same weight alignment holds.
arXiv Detail & Related papers (2020-02-22T21:13:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.