Related papers: Efficient Flow Matching using Latent Variables

Efficient Flow Matching using Latent Variables

URL: http://arxiv.org/abs/2505.04486v2
Date: Fri, 23 May 2025 18:49:37 GMT
Title: Efficient Flow Matching using Latent Variables
Authors: Anirban Samaddar, Yixuan Sun, Viktor Nilsson, Sandeep Madireddy,
Abstract summary: We present $textttLatent-CFM$, which provides simplified training/inference strategies to incorporate multi-modal data structures.<n>We show that $textttLatent-CFM$ exhibits improved generation quality with significantly less training.
Score: 3.5817637191799605
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Flow matching models have shown great potential in image generation tasks among probabilistic generative models. However, most flow matching models in the literature do not explicitly model the underlying structure/manifold in the target data when learning the flow from a simple source distribution like the standard Gaussian. This leads to inefficient learning, especially for many high-dimensional real-world datasets, which often reside in a low-dimensional manifold. Existing strategies of incorporating manifolds, including data with underlying multi-modal distribution, often require expensive training and hence frequently lead to suboptimal performance. To this end, we present $\texttt{Latent-CFM}$, which provides simplified training/inference strategies to incorporate multi-modal data structures using pretrained deep latent variable models. Through experiments on multi-modal synthetic data and widely used image benchmark datasets, we show that $\texttt{Latent-CFM}$ exhibits improved generation quality with significantly less training (up to $\sim 50\%$ less) and computation than state-of-the-art flow matching models by incorporating extracted data features using pretrained lightweight latent variable models. Moving beyond natural images to generating fields arising from processes governed by physics, using a 2d Darcy flow dataset, we demonstrate that our approach generates more physically accurate samples than competitive approaches. In addition, through latent space analysis, we demonstrate that our approach can be used for conditional image generation conditioned on latent features, which adds interpretability to the generation process.

Related papers

SPaRFT: Self-Paced Reinforcement Fine-Tuning for Large Language Models [51.74498855100541]
Large language models (LLMs) have shown strong reasoning capabilities when fine-tuned with reinforcement learning (RL)<n>We propose textbfSPaRFT, a self-paced learning framework that enables efficient learning based on the capability of the model being trained.
arXiv Detail & Related papers (2025-08-07T03:50:48Z)
Diffusion models for multivariate subsurface generation and efficient probabilistic inversion [0.0]
Diffusion models offer stable training and state-of-the-art performance for deep generative modeling tasks.<n>We introduce a likelihood approximation accounting for the noise-contamination that is inherent in diffusion modeling.<n>Our tests show significantly improved statistical robustness, enhanced sampling of the posterior probability density function.
arXiv Detail & Related papers (2025-07-21T17:10:16Z)
Dataset Distillation with Probabilistic Latent Features [9.318549327568695]
A compact set of synthetic data can effectively replace the original dataset in downstream classification tasks.<n>We propose a novel approach that models the joint distribution of latent features.<n>Our method achieves state-of-the-art cross architecture performance across a range of backbone architectures.
arXiv Detail & Related papers (2025-05-10T13:53:49Z)
Local Flow Matching Generative Models [19.859984725284896]
Local Flow Matching is a computational framework for density estimation based on flow-based generative models.<n>$textttLFM$ employs a simulation-free scheme and incrementally learns a sequence of Flow Matching sub-models.<n>We demonstrate the improved training efficiency and competitive generative performance of $textttLFM$ compared to FM.
arXiv Detail & Related papers (2024-10-03T14:53:10Z)
Fisher Flow Matching for Generative Modeling over Discrete Data [12.69975914345141]
We introduce Fisher-Flow, a novel flow-matching model for discrete data. Fisher-Flow takes a manifestly geometric perspective by considering categorical distributions over discrete data. We prove that the gradient flow induced by Fisher-Flow is optimal in reducing the forward KL divergence.
arXiv Detail & Related papers (2024-05-23T15:02:11Z)
DepthFM: Fast Monocular Depth Estimation with Flow Matching [22.206355073676082]
Current discriminative depth estimation methods often produce blurry artifacts, while generative approaches suffer from slow sampling due to curvatures in the noise-to-depth transport.<n>Our method addresses these challenges by framing depth estimation as a direct transport between image and depth distributions.<n>Our approach achieves competitive zero-shot performance on standard benchmarks of complex natural scenes while improving sampling efficiency and only requiring minimal synthetic data for training.
arXiv Detail & Related papers (2024-03-20T17:51:53Z)
Distribution-Aware Data Expansion with Diffusion Models [55.979857976023695]
We propose DistDiff, a training-free data expansion framework based on the distribution-aware diffusion model. DistDiff consistently enhances accuracy across a diverse range of datasets compared to models trained solely on original data.
arXiv Detail & Related papers (2024-03-11T14:07:53Z)
Guided Flows for Generative Modeling and Decision Making [55.42634941614435]
We show that Guided Flows significantly improves the sample quality in conditional image generation and zero-shot text synthesis-to-speech. Notably, we are first to apply flow models for plan generation in the offline reinforcement learning setting ax speedup in compared to diffusion models.
arXiv Detail & Related papers (2023-11-22T15:07:59Z)
StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data [129.92449761766025]
We propose a novel data collection methodology that synchronously synthesizes images and dialogues for visual instruction tuning. This approach harnesses the power of generative models, marrying the abilities of ChatGPT and text-to-image generative models. Our research includes comprehensive experiments conducted on various datasets.
arXiv Detail & Related papers (2023-08-20T12:43:52Z)
Diff-Instruct: A Universal Approach for Transferring Knowledge From Pre-trained Diffusion Models [77.83923746319498]
We propose a framework called Diff-Instruct to instruct the training of arbitrary generative models. We show that Diff-Instruct results in state-of-the-art single-step diffusion-based models. Experiments on refining GAN models show that the Diff-Instruct can consistently improve the pre-trained generators of GAN models.
arXiv Detail & Related papers (2023-05-29T04:22:57Z)
ChiroDiff: Modelling chirographic data with Diffusion Models [132.5223191478268]
We introduce a powerful model-class namely "Denoising Diffusion Probabilistic Models" or DDPMs for chirographic data. Our model named "ChiroDiff", being non-autoregressive, learns to capture holistic concepts and therefore remains resilient to higher temporal sampling rate.
arXiv Detail & Related papers (2023-04-07T15:17:48Z)
VQ-Flows: Vector Quantized Local Normalizing Flows [2.7998963147546148]
We introduce a novel statistical framework for learning a mixture of local normalizing flows as "chart maps" over a data manifold. Our framework augments the expressivity of recent approaches while preserving the signature property of normalizing flows, that they admit exact density evaluation.
arXiv Detail & Related papers (2022-03-22T09:22:18Z)
How Well Do Sparse Imagenet Models Transfer? [75.98123173154605]
Transfer learning is a classic paradigm by which models pretrained on large "upstream" datasets are adapted to yield good results on "downstream" datasets. In this work, we perform an in-depth investigation of this phenomenon in the context of convolutional neural networks (CNNs) trained on the ImageNet dataset. We show that sparse models can match or even outperform the transfer performance of dense models, even at high sparsities.
arXiv Detail & Related papers (2021-11-26T11:58:51Z)
Flow-based Generative Models for Learning Manifold to Manifold Mappings [39.60406116984869]
We introduce three kinds of invertible layers for manifold-valued data, which are analogous to their functionality in flow-based generative models. We show promising results where we can reliably and accurately reconstruct brain images of a field of orientation distribution functions.
arXiv Detail & Related papers (2020-12-18T02:19:18Z)
Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the ( aggregate) posterior to encourage statistical independence of the latent factors. We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method. Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
Flows for simultaneous manifold learning and density estimation [12.451050883955071]
manifold-learning flows (M-flows) represent datasets with a manifold structure more faithfully. M-flows learn the data manifold and allow for better inference than standard flows in the ambient data space.
arXiv Detail & Related papers (2020-03-31T02:07:48Z)
Semi-Supervised Learning with Normalizing Flows [54.376602201489995]
FlowGMM is an end-to-end approach to generative semi supervised learning with normalizing flows. We show promising results on a wide range of applications, including AG-News and Yahoo Answers text data.
arXiv Detail & Related papers (2019-12-30T17:36:33Z)

This list is automatically generated from the titles and abstracts of the papers in this site.