Phase-aware Training Schedule Simplifies Learning in Flow-Based Generative Models
- URL: http://arxiv.org/abs/2412.07972v4
- Date: Sun, 09 Feb 2025 16:59:28 GMT
- Title: Phase-aware Training Schedule Simplifies Learning in Flow-Based Generative Models
- Authors: Santiago Aranguri, Francesco Insulla
- Abstract summary: We analyze the training of a two-layer autoencoder used to parameterize a flow-based generative model.
We find that the autoencoder representing the velocity field learns to simplify by estimating only the parameters relevant to each phase.
- Score: 0.1534667887016089
- Abstract: We analyze the training of a two-layer autoencoder used to parameterize a flow-based generative model for sampling from a high-dimensional Gaussian mixture. Previous work shows that the phase where the relative probability between the modes is learned disappears as the dimension goes to infinity without an appropriate time schedule. We introduce a time dilation that solves this problem. This enables us to characterize the learned velocity field, finding a first phase where the probability of each mode is learned and a second phase where the variance of each mode is learned. We find that the autoencoder representing the velocity field learns to simplify by estimating only the parameters relevant to each phase. Turning to real data, we propose a method that, for a given feature, finds intervals of time where training improves accuracy the most on that feature. Since practitioners typically sample training times uniformly, our method enables more efficient training. We provide preliminary experiments validating this approach.
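To make the scheduling idea concrete, here is a minimal sketch of a flow-matching training step that biases training times toward a chosen interval instead of sampling them uniformly. The interval `(t_lo, t_hi)`, the mixture weight `p_focus`, and the `model(xt, t)` signature are illustrative assumptions, not the paper's actual schedule or API; the paper derives its time dilation analytically for the Gaussian-mixture setting.

```python
import torch

def sample_training_times(batch_size, t_lo=0.3, t_hi=0.6, p_focus=0.8):
    """Draw flow-matching training times non-uniformly.

    With probability p_focus, sample from the interval (t_lo, t_hi)
    where training is assumed to improve the target feature most;
    otherwise fall back to the usual Uniform(0, 1) draw. The interval
    and mixture weight are illustrative placeholders.
    """
    u = torch.rand(batch_size)
    focused = t_lo + (t_hi - t_lo) * torch.rand(batch_size)
    uniform = torch.rand(batch_size)
    return torch.where(u < p_focus, focused, uniform)

def flow_matching_step(model, x1, optimizer):
    """One flow-matching step using the biased time schedule.

    Uses the linear interpolant x_t = (1 - t) x0 + t x1 with target
    velocity x1 - x0; any other interpolant plugs in the same way.
    """
    x0 = torch.randn_like(x1)                 # base Gaussian sample
    t = sample_training_times(x1.shape[0]).view(-1, *([1] * (x1.dim() - 1)))
    xt = (1 - t) * x0 + t * x1
    target_v = x1 - x0
    pred_v = model(xt, t.flatten())           # assumed (x, t) -> velocity
    loss = (pred_v - target_v).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The point of the sketch is only that reweighting training times is a drop-in change to an otherwise standard flow-matching loop; the analysis in the paper is what justifies a particular choice of interval.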
Related papers
- Neural Flow Samplers with Shortcut Models [19.81513273510523]
Flow-based samplers generate samples by learning a velocity field that satisfies the continuity equation.
While importance sampling provides an approximation, it suffers from high variance.
arXiv Detail & Related papers (2025-02-11T07:55:41Z)
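For context on the entry above: a flow-based sampler turns a learned velocity field into samples by integrating the ODE dx/dt = v(x, t) from base noise. A minimal Euler-integration sketch, assuming a trained `velocity(x, t)` callable (the name and signature are placeholders):

```python
import torch

@torch.no_grad()
def sample_with_velocity_field(velocity, n_samples, dim, n_steps=100):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with Euler steps.

    `velocity` is any trained field mapping (x, t) -> dx/dt; starting
    from standard Gaussian noise, the trajectory endpoint is a sample
    from the model's target distribution.
    """
    x = torch.randn(n_samples, dim)        # base distribution
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = torch.full((n_samples,), k * dt)
        x = x + dt * velocity(x, t)        # explicit Euler update
    return x
```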
- Truncated Consistency Models [57.50243901368328]
Training consistency models requires learning to map all intermediate points along PF ODE trajectories to their corresponding endpoints.
We empirically find that this training paradigm limits the one-step generation performance of consistency models.
We propose a new parameterization of the consistency function and a two-stage training procedure that prevents the truncated-time training from collapsing to a trivial solution.
arXiv Detail & Related papers (2024-10-18T22:38:08Z)
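The "map intermediate points to endpoints" objective above is typically trained with a consistency loss between adjacent points on the same noising trajectory, using an EMA copy of the network as the target. A schematic sketch of plain consistency training, not the truncated two-stage variant; `f` and `f_ema` (initialized as a deep copy of `f`) and the simple linear noise levels are simplifying assumptions:

```python
import torch

def consistency_training_step(f, f_ema, x, optimizer, ema_decay=0.999):
    """One schematic consistency-training step.

    Two adjacent noise levels t < t_next are applied to the same clean
    batch with the same noise z, and f is pulled toward the EMA
    target's output at the smaller noise level, so that all points on
    a trajectory map to a common endpoint.
    """
    z = torch.randn_like(x)
    t = 0.05 + 0.9 * torch.rand(x.shape[0])            # t in (0.05, 0.95)
    t_next = t + 0.05                                  # adjacent level
    shape = (-1,) + (1,) * (x.dim() - 1)
    pred = f(x + t_next.view(shape) * z, t_next)
    with torch.no_grad():                              # stop-gradient target
        target = f_ema(x + t.view(shape) * z, t)
    loss = (pred - target).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    with torch.no_grad():                              # EMA target update
        for p, p_ema in zip(f.parameters(), f_ema.parameters()):
            p_ema.mul_(ema_decay).add_(p, alpha=1 - ema_decay)
    return loss.item()
```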
- Consistency Flow Matching: Defining Straight Flows with Velocity Consistency [97.28511135503176]
We introduce Consistency Flow Matching (Consistency-FM), a novel FM method that explicitly enforces self-consistency in the velocity field.
Preliminary experiments demonstrate that our Consistency-FM significantly improves training efficiency by converging 4.4x faster than consistency models.
arXiv Detail & Related papers (2024-07-02T16:15:37Z)
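One way to read "self-consistency in the velocity field" above: points on the same path should agree on the straight-line extrapolation x_t + (1 - t) v(x_t, t). A schematic loss in that spirit; the published Consistency-FM objective also matches velocities and uses a stop-gradient/EMA target to avoid trivial solutions, which this sketch omits:

```python
import torch

def velocity_consistency_loss(v, x0, x1, delta=0.1):
    """Schematic self-consistency loss for a velocity field v(x, t).

    Two times t and t + delta on the same linear interpolant path
    should agree on the straight-line endpoint x_t + (1 - t) v(x_t, t);
    matching these endpoints encourages straight flows. In practice
    this term is combined with the usual flow-matching loss and a
    stop-gradient target rather than used alone.
    """
    t = torch.rand(x0.shape[0]) * (1 - delta)
    shape = (-1,) + (1,) * (x0.dim() - 1)
    ts, tn = t.view(shape), (t + delta).view(shape)
    xt = (1 - ts) * x0 + ts * x1          # point at time t
    xn = (1 - tn) * x0 + tn * x1          # point at time t + delta
    end_t = xt + (1 - ts) * v(xt, t)
    end_n = xn + (1 - tn) * v(xn, t + delta)
    return (end_t - end_n).pow(2).mean()
```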
- Improving Consistency Models with Generator-Augmented Flows [16.049476783301724]
Consistency models imitate the multi-step sampling of score-based diffusion in a single forward pass of a neural network.
They can be learned in two ways: consistency distillation and consistency training.
We propose a novel flow that transports noisy data towards their corresponding outputs derived from a consistency model.
arXiv Detail & Related papers (2024-06-13T20:22:38Z)
- Flow Map Matching [15.520853806024943]
Flow map matching is an algorithm that learns the two-time flow map of an underlying ordinary differential equation.
We show that flow map matching leads to high-quality samples with significantly reduced sampling cost compared to diffusion or interpolant methods.
arXiv Detail & Related papers (2024-06-11T17:41:26Z)
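A two-time flow map, as in the entry above, replaces many small ODE solver steps with a few learned jumps. A sampling sketch assuming a trained `flow_map(x, s, t)` callable (hypothetical name and signature):

```python
import torch

@torch.no_grad()
def sample_with_flow_map(flow_map, n_samples, dim, n_jumps=2):
    """Sampling with a learned two-time flow map.

    `flow_map(x, s, t)` approximates the solution map of the
    underlying ODE, i.e. it jumps a state at time s directly to time
    t. A single call replaces the many small steps an ODE solver
    would take; chaining a few calls trades cost for accuracy.
    """
    x = torch.randn(n_samples, dim)                    # state at time 0
    ts = torch.linspace(0.0, 1.0, steps=n_jumps + 1)
    for s, t in zip(ts[:-1], ts[1:]):
        s_b = torch.full((n_samples,), float(s))
        t_b = torch.full((n_samples,), float(t))
        x = flow_map(x, s_b, t_b)                      # jump from s to t
    return x
```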
- Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization [87.21285093582446]
Diffusion Generative Flow Samplers (DGFS) is a sampling-based framework where the learning process can be tractably broken down into short partial trajectory segments.
Our method takes inspiration from the theory developed for generative flow networks (GFlowNets).
arXiv Detail & Related papers (2023-10-04T09:39:05Z)
- Fast Sampling of Diffusion Models via Operator Learning [74.37531458470086]
We use neural operators, an efficient method to solve the probability flow differential equations, to accelerate the sampling process of diffusion models.
Unlike other fast sampling methods, which are sequential in nature, ours is the first parallel decoding method.
We show our method achieves state-of-the-art FID of 3.78 for CIFAR-10 and 7.83 for ImageNet-64 in the one-model-evaluation setting.
arXiv Detail & Related papers (2022-11-24T07:30:27Z)
- Intersection of Parallels as an Early Stopping Criterion [64.8387564654474]
We propose a method to spot an early stopping point in the training iterations without the need for a validation set.
For a wide range of learning rates, our method, called Cosine-Distance Criterion (CDC), leads to better generalization on average than all the methods that we compare against.
arXiv Detail & Related papers (2022-08-19T19:42:41Z)
- Accelerated Probabilistic Marching Cubes by Deep Learning for Time-Varying Scalar Ensembles [5.102811033640284]
This paper introduces a deep-learning-based approach to learning the level-set uncertainty for two-dimensional ensemble data.
We train the model using the first few time steps from time-varying ensemble data in our workflow.
We demonstrate that our trained model accurately infers uncertainty in level sets for new time steps and is up to 170X faster than the original probabilistic model.
arXiv Detail & Related papers (2022-07-15T02:35:41Z)
- Contrastive learning of strong-mixing continuous-time stochastic processes [53.82893653745542]
Contrastive learning is a family of self-supervised methods where a model is trained to solve a classification task constructed from unlabeled data.
We show that a properly constructed contrastive learning task can be used to estimate the transition kernel for small-to-mid-range intervals in the diffusion case.
arXiv Detail & Related papers (2021-03-03T23:06:47Z)
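The contrastive construction above can be pictured as binary classification between real and mismatched trajectory pairs; by the standard noise-contrastive argument, the optimal classifier's logit recovers a density ratio involving the transition kernel. A schematic sketch, with `classifier` and the shuffle-based pairing scheme as illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def contrastive_kernel_step(classifier, x_t, x_next, optimizer):
    """Schematic contrastive task for transition-kernel estimation.

    Positive pairs (x_t, x_{t+dt}) come from observed trajectories;
    negatives pair x_t with a shuffled x_{t+dt} from another
    trajectory. A classifier trained on this task learns a logit that
    is, up to constants, the log-ratio between the transition kernel
    and the marginal: the standard noise-contrastive identity.
    """
    perm = torch.randperm(x_next.shape[0])
    pos = classifier(torch.cat([x_t, x_next], dim=-1))        # true pairs
    neg = classifier(torch.cat([x_t, x_next[perm]], dim=-1))  # broken pairs
    logits = torch.cat([pos, neg]).squeeze(-1)
    labels = torch.cat([torch.ones(len(pos)), torch.zeros(len(neg))])
    loss = F.binary_cross_entropy_with_logits(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```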