Taming Hyperparameter Tuning in Continuous Normalizing Flows Using the
JKO Scheme
- URL: http://arxiv.org/abs/2211.16757v1
- Date: Wed, 30 Nov 2022 05:53:21 GMT
- Title: Taming Hyperparameter Tuning in Continuous Normalizing Flows Using the
JKO Scheme
- Authors: Alexander Vidal, Samy Wu Fung, Luis Tenorio, Stanley Osher, Levon
Nurbekyan
- Abstract summary: A normalizing flow (NF) is a mapping that transforms a chosen probability distribution to a normal distribution.
We present JKO-Flow, an algorithm to solve OT-based CNFs without the need to tune $\alpha$.
- Score: 60.79981399724534
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A normalizing flow (NF) is a mapping that transforms a chosen probability
distribution to a normal distribution. Such flows are a common technique used
for data generation and density estimation in machine learning and data
science. The density estimate obtained with a NF requires a change of variables
formula that involves the computation of the Jacobian determinant of the NF
transformation. In order to tractably compute this determinant, continuous
normalizing flows (CNF) estimate the mapping and its Jacobian determinant using
a neural ODE. Optimal transport (OT) theory has been successfully used to
assist in finding CNFs by formulating them as OT problems with a soft penalty
for enforcing the standard normal distribution as a target measure. A drawback
of OT-based CNFs is the addition of a hyperparameter, $\alpha$, that controls
the strength of the soft penalty and requires significant tuning. We present
JKO-Flow, an algorithm to solve OT-based CNFs without the need to tune
$\alpha$. This is achieved by integrating the OT CNF framework into a
Wasserstein gradient flow framework, also known as the JKO scheme. Instead of
tuning $\alpha$, we repeatedly solve the optimization problem for a fixed
$\alpha$, effectively performing a JKO update with a time-step $\alpha$. Hence
we obtain a "divide and conquer" algorithm by repeatedly solving simpler
problems instead of solving a potentially harder problem with large $\alpha$.
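To make the outer loop concrete, here is a minimal, hedged sketch of the "divide and conquer" structure described above: repeatedly solve the same fixed-$\alpha$ sub-problem, each solve acting as one JKO update with time-step $\alpha$. The paper's blocks are OT-regularized neural-ODE (OT-Flow style) models with exact Jacobian log-determinants; the sketch below swaps in small affine-coupling blocks and approximates the Wasserstein-2 term by the mean squared displacement of the map, purely to illustrate the loop. All names (CouplingBlock, jko_flow) and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class CouplingBlock(nn.Module):
    """Invertible map with tractable log|det J|; stands in for one JKO block."""
    def __init__(self, dim=2, hidden=64, flip=False):
        super().__init__()
        self.d = dim // 2
        self.flip = flip
        self.net = nn.Sequential(nn.Linear(self.d, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 2 * (dim - self.d)))

    def forward(self, x):
        if self.flip:                        # alternate which half is transformed
            x = x.flip(dims=[1])
        x1, x2 = x[:, :self.d], x[:, self.d:]
        s, t = self.net(x1).chunk(2, dim=1)
        s = torch.tanh(s)                    # keep scales bounded for stability
        y = torch.cat([x1, x2 * torch.exp(s) + t], dim=1)
        if self.flip:
            y = y.flip(dims=[1])
        return y, s.sum(dim=1)               # transformed samples and log|det J|

def jko_flow(data, n_blocks=8, alpha=1.0, iters=500, lr=1e-2):
    """Repeatedly solve the fixed-alpha sub-problem; each solve is one JKO update."""
    blocks, x_k = [], data                   # x_k ~ rho_k, starting from the data
    for k in range(n_blocks):
        block = CouplingBlock(data.shape[1], flip=(k % 2 == 1))
        opt = torch.optim.Adam(block.parameters(), lr=lr)
        for _ in range(iters):
            z, logdet = block(x_k)
            # KL(T#rho_k || N(0, I)) up to an additive constant
            kl = (0.5 * (z ** 2).sum(dim=1) - logdet).mean()
            # W_2^2 surrogate: mean squared displacement of the map
            transport = ((z - x_k) ** 2).sum(dim=1).mean()
            loss = kl + transport / (2.0 * alpha)   # JKO step with time-step alpha
            opt.zero_grad(); loss.backward(); opt.step()
        with torch.no_grad():
            x_k, _ = block(x_k)              # push samples forward to rho_{k+1}
        blocks.append(block)
    return blocks

# Toy usage: normalize a shifted 2-D Gaussian without tuning alpha.
data = torch.randn(2048, 2) * 0.5 + torch.tensor([3.0, -2.0])
flow = jko_flow(data)
```

Note that $\alpha$ is never tuned in this loop: it only sets the JKO time-step, and accuracy is controlled by the number of outer iterations (blocks) instead.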
Related papers
- Entropy-Informed Weighting Channel Normalizing Flow [7.751853409569806]
We propose a regularized and feature-dependent $\mathtt{Shuffle}$ operation and integrate it into the vanilla multi-scale architecture.
We observe that this operation guides the variables to evolve in the direction of entropy increase; hence we refer to NFs with the $\mathtt{Shuffle}$ operation as Entropy-Informed Weighting Channel Normalizing Flow (EIW-Flow).
arXiv Detail & Related papers (2024-07-06T04:46:41Z)
- Sub-linear Regret in Adaptive Model Predictive Control [56.705978425244496]
We present STT-MPC (Self-Tuning Tube-based Model Predictive Control), an online oracle that combines the certainty-equivalence principle and polytopic tubes.
We analyze the regret of the algorithm when compared to an algorithm initially aware of the system dynamics.
arXiv Detail & Related papers (2023-10-07T15:07:10Z)
- FedNAR: Federated Optimization with Normalized Annealing Regularization [54.42032094044368]
We explore the choices of weight decay and identify that the weight decay value appreciably influences the convergence of existing FL algorithms.
We develop Federated optimization with Normalized Annealing Regularization (FedNAR), a plug-in that can be seamlessly integrated into any existing FL algorithm.
arXiv Detail & Related papers (2023-10-04T21:11:40Z)
- GQFedWAvg: Optimization-Based Quantized Federated Learning in General Edge Computing Systems [11.177402054314674]
The optimal implementation of federated learning (FL) in practical edge computing has been an outstanding problem.
We propose an optimization-based quantized FL algorithm that can appropriately fit a general edge computing system with uniform or nonuniform computing and communication systems.
arXiv Detail & Related papers (2023-06-13T02:18:24Z)
- Green, Quantized Federated Learning over Wireless Networks: An Energy-Efficient Design [68.86220939532373]
The finite precision level is captured through the use of quantized neural networks (QNNs) that quantize weights and activations in fixed-precision format.
The proposed FL framework can reduce energy consumption until convergence by up to 70% compared to a baseline FL algorithm.
arXiv Detail & Related papers (2022-07-19T16:37:24Z)
- Self Normalizing Flows [65.73510214694987]
We propose a flexible framework for training normalizing flows by replacing expensive terms in the gradient by learned approximate inverses at each layer.
This reduces the computational complexity of each layer's exact update from $\mathcal{O}(D^3)$ to $\mathcal{O}(D^2)$.
We show experimentally that such models are remarkably stable and optimize to similar data likelihood values as their exact gradient counterparts (a toy sketch of this learned-inverse update appears after this list).
arXiv Detail & Related papers (2020-11-14T09:51:51Z)
- OT-Flow: Fast and Accurate Continuous Normalizing Flows via Optimal Transport [8.468007443062751]
A normalizing flow is an invertible mapping between an arbitrary probability distribution and a standard normal distribution.
OT-Flow tackles two critical computational challenges that limit a more widespread use of CNFs.
On five high-dimensional density estimation and generative modeling tasks, OT-Flow performs competitively to state-of-the-art CNFs.
arXiv Detail & Related papers (2020-05-29T22:31:10Z)
- Learning Likelihoods with Conditional Normalizing Flows [54.60456010771409]
Conditional normalizing flows (CNFs) are efficient in sampling and inference.
We present a study of CNFs where the base-density-to-output-space mapping is conditioned on an input x, to model conditional densities p(y|x); a minimal conditional-flow sketch also appears after this list.
arXiv Detail & Related papers (2019-11-29T19:17:58Z)
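For the Self Normalizing Flows entry above, the $\mathcal{O}(D^3)$ term being avoided is the gradient of $\log|\det W|$ for a linear flow layer $z = Wx$, which equals $W^{-T}$. The sketch below is a hedged toy reading of the idea, under simplifying assumptions not taken from the paper: a single linear layer, plain gradient descent, and a reconstruction penalty that keeps a learned matrix R close to $W^{-1}$ so that $R^T$ can stand in for $W^{-T}$; the paper trains full multi-layer flows with a richer objective.

```python
import numpy as np

rng = np.random.default_rng(0)
D, lr, lam = 8, 1e-2, 1.0
W = np.eye(D) + 0.1 * rng.standard_normal((D, D))   # forward weight of z = W x
R = np.eye(D) + 0.1 * rng.standard_normal((D, D))   # learned approximate inverse

for step in range(2000):
    x = rng.standard_normal((64, D))                # toy minibatch
    z = x @ W.T                                     # forward pass z = W x
    # Exact NLL gradient for a standard-normal base density:
    #   d/dW [ 0.5*||W x||^2 - log|det W| ] = E[z x^T] - W^{-T}
    grad_fit = z.T @ x / len(x)
    grad_W = grad_fit - R.T                         # R^T replaces the O(D^3) term W^{-T}
    # Keep R close to W^{-1} via the reconstruction penalty ||R z - x||^2
    x_rec = z @ R.T
    grad_R = lam * (x_rec - x).T @ z / len(x)
    W -= lr * grad_W
    R -= lr * grad_R

print(np.abs(R @ W - np.eye(D)).max())              # R should roughly invert W
```

Each update touches only matrix-vector and outer-product terms, which is where the claimed $\mathcal{O}(D^2)$ per-layer cost comes from in this simplified setting.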
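For the Learning Likelihoods with Conditional Normalizing Flows entry, here is a minimal sketch of conditioning the base-to-output mapping on an input x to model p(y|x). It is an assumption of this sketch that a single conditional affine transform suffices; the paper composes richer conditional layers. Names such as cond_net, log_prob, and sample are illustrative, not from the paper.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)
dim_x, dim_y = 3, 2

# Small network mapping the condition x to a shift mu(x) and log-scale s(x).
cond_net = nn.Sequential(nn.Linear(dim_x, 64), nn.ReLU(),
                         nn.Linear(64, 2 * dim_y))

def log_prob(y, x):
    """log p(y|x) via the change of variables z = (y - mu(x)) * exp(-s(x))."""
    mu, s = cond_net(x).chunk(2, dim=1)
    z = (y - mu) * torch.exp(-s)
    log_base = -0.5 * (z ** 2).sum(dim=1) - 0.5 * dim_y * math.log(2 * math.pi)
    log_det = -s.sum(dim=1)                          # log|det dz/dy|
    return log_base + log_det

def sample(x):
    """Draw y ~ p(y|x) by pushing base noise through the conditional map."""
    mu, s = cond_net(x).chunk(2, dim=1)
    return mu + torch.exp(s) * torch.randn_like(mu)

# Toy training: y depends on x through a noisy linear relation.
opt = torch.optim.Adam(cond_net.parameters(), lr=1e-2)
A = torch.randn(dim_x, dim_y)
for step in range(1000):
    x = torch.randn(256, dim_x)
    y = x @ A + 0.1 * torch.randn(256, dim_y)
    loss = -log_prob(y, x).mean()                    # maximum conditional likelihood
    opt.zero_grad(); loss.backward(); opt.step()
```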
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.