An Introduction to Flow Matching and Diffusion Models
- URL: http://arxiv.org/abs/2506.02070v2
- Date: Sat, 12 Jul 2025 16:37:24 GMT
- Title: An Introduction to Flow Matching and Diffusion Models
- Authors: Peter Holderrieth, Ezra Erives
- Abstract summary: This tutorial provides a self-contained introduction to diffusion and flow-based generative models from first principles. We develop the necessary mathematical background in ordinary and stochastic differential equations and derive the core algorithms of flow matching and denoising diffusion models. We then provide a step-by-step guide to building image and video generators, including training methods, guidance, and architectural design.
- Score: 1.2277343096128712
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Diffusion and flow-based models have become the state of the art for generative AI across a wide range of data modalities, including images, videos, shapes, molecules, music, and more. This tutorial provides a self-contained introduction to diffusion and flow-based generative models from first principles. We systematically develop the necessary mathematical background in ordinary and stochastic differential equations and derive the core algorithms of flow matching and denoising diffusion models. We then provide a step-by-step guide to building image and video generators, including training methods, guidance, and architectural design. This tutorial is ideal for machine learning researchers who want to develop a principled understanding of the theory and practice of generative AI.
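The flow matching recipe described in the abstract can be illustrated with a minimal sketch. The linear interpolation path, the conditional velocity target, and the Euler sampler below are standard textbook choices, not code from the paper; all names and values are illustrative.

```python
import numpy as np

# Minimal sketch of conditional flow matching, assuming the common linear
# interpolation path x_t = (1 - t) * x0 + t * x1 between noise and data.
rng = np.random.default_rng(0)

def sample_path(x0, x1, t):
    """Point on the conditional probability path at time t."""
    return (1.0 - t) * x0 + t * x1

def target_velocity(x0, x1):
    """Conditional vector field a network would regress onto: d/dt x_t = x1 - x0."""
    return x1 - x0

# Toy data: noise sample x0 ~ N(0, I), data sample x1 (illustrative values).
x0 = rng.standard_normal(2)
x1 = np.array([3.0, -1.0])
t = 0.5

x_t = sample_path(x0, x1, t)       # training input at time t
v_t = target_velocity(x0, x1)      # regression target for the vector field

# Euler integration of the ODE dx/dt = v transports x0 to x1 by t = 1,
# since the velocity is constant along this linear path.
x = x0.copy()
for _ in range(10):
    x = x + 0.1 * target_velocity(x0, x1)
assert np.allclose(x, x1)
```

In practice the constant conditional velocity is replaced by a learned network v_theta(x_t, t) trained with a mean-squared error against `v_t`, and the same Euler loop then samples from the model.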
Related papers
- Deep generative models as the probability transformation functions [0.0]
This paper introduces a unified theoretical perspective that views deep generative models as probability transformation functions. We demonstrate that they all fundamentally operate by transforming simple predefined distributions into complex target data distributions.
arXiv Detail & Related papers (2025-06-20T17:22:23Z)
- DiffusionSfM: Predicting Structure and Motion via Ray Origin and Endpoint Diffusion [53.70278210626701]
We propose a data-driven multi-view reasoning approach that directly infers 3D scene geometry and camera poses from multi-view images. Our framework, DiffusionSfM, parameterizes scene geometry and cameras as pixel-wise ray origins and endpoints in a global frame. We empirically validate DiffusionSfM on both synthetic and real datasets, demonstrating that it outperforms classical and learning-based approaches.
arXiv Detail & Related papers (2025-05-08T17:59:47Z)
- Flow-based generative models as iterative algorithms in probability space [18.701755188870823]
Flow-based generative models offer exact likelihood estimation, efficient sampling, and deterministic transformations. This tutorial presents an intuitive mathematical framework for flow-based generative models. We aim to equip researchers and practitioners with the necessary tools to effectively apply flow-based generative models in signal processing and machine learning.
arXiv Detail & Related papers (2025-02-19T03:09:18Z)
- Inference-Time Alignment in Diffusion Models with Reward-Guided Generation: Tutorial and Review [59.856222854472605]
This tutorial provides an in-depth guide on inference-time guidance and alignment methods for optimizing downstream reward functions in diffusion models. Practical applications in fields such as biology often require sample generation that maximizes specific metrics. We discuss (1) fine-tuning methods combined with inference-time techniques, (2) inference-time algorithms based on search algorithms such as Monte Carlo tree search, and (3) connections between inference-time algorithms in language models and diffusion models.
arXiv Detail & Related papers (2025-01-16T17:37:35Z)
- Generative Diffusion Modeling: A Practical Handbook [25.81859481634996]
This handbook covers diffusion probabilistic models, score-based generative models, consistency models, rectified flow, and related methods. Content encompasses the fundamentals of diffusion models, the pre-training process, and various post-training methods. Designed as a practical guide, it emphasizes clarity and usability over theoretical depth.
arXiv Detail & Related papers (2024-12-22T21:02:36Z)
- Diffusion Model from Scratch [0.0]
Diffusion models are currently the most popular class of generative models. This paper aims to assist readers in building a foundational understanding of generative models by tracing the evolution from VAEs to DDPM.
arXiv Detail & Related papers (2024-12-14T13:05:05Z)
- Physics Informed Distillation for Diffusion Models [21.173298037358954]
We introduce Physics Informed Distillation (PID), which employs a student model to represent the solution of the ODE system corresponding to the teacher diffusion model.
We observe that PID achieves performance comparable to recent distillation methods.
arXiv Detail & Related papers (2024-11-13T07:03:47Z)
- Score Forgetting Distillation: A Swift, Data-Free Method for Machine Unlearning in Diffusion Models [63.43422118066493]
Machine unlearning (MU) is a crucial foundation for developing safe, secure, and trustworthy GenAI models. Traditional MU methods often rely on stringent assumptions and require access to real data. This paper introduces Score Forgetting Distillation (SFD), an innovative MU approach that promotes the forgetting of undesirable information in diffusion models.
arXiv Detail & Related papers (2024-09-17T14:12:50Z)
- Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding [84.3224556294803]
Diffusion models excel at capturing the natural design spaces of images, molecules, DNA, RNA, and protein sequences.
We aim to optimize downstream reward functions while preserving the naturalness of these design spaces.
Our algorithm integrates soft value functions, which look ahead to how intermediate noisy states lead to high rewards in the future.
arXiv Detail & Related papers (2024-08-15T16:47:59Z)
- An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization [59.63880337156392]
Diffusion models have achieved tremendous success in computer vision, audio, reinforcement learning, and computational biology.
Despite this significant empirical success, the theory of diffusion models remains limited.
This paper provides a well-rounded theoretical exposition to stimulate forward-looking theories and methods for diffusion models.
arXiv Detail & Related papers (2024-04-11T14:07:25Z)
- Applied Causal Inference Powered by ML and AI [54.88868165814996]
The book presents ideas from classical structural equation models (SEMs) and their modern AI equivalents, directed acyclic graphs (DAGs) and structural causal models (SCMs).
It covers Double/Debiased Machine Learning methods to do inference in such models using modern predictive tools.
arXiv Detail & Related papers (2024-03-04T20:28:28Z)
- Demystifying Variational Diffusion Models [19.977841588918373]
Most existing work on diffusion models focuses on either applications or theoretical contributions. We revisit predecessors to diffusion models, like hierarchical latent variable models, and synthesize a holistic perspective. The resulting narrative is easier to follow as it imposes fewer prerequisites on the average reader.
arXiv Detail & Related papers (2024-01-11T22:37:37Z)
- Semantic Guidance Tuning for Text-To-Image Diffusion Models [3.3881449308956726]
We propose a training-free approach that modulates the guidance direction of diffusion models during inference.
We first decompose the prompt semantics into a set of concepts, and monitor the guidance trajectory in relation to each concept.
Based on this observation, we devise a technique to steer the guidance direction towards any concept from which the model diverges.
arXiv Detail & Related papers (2023-12-26T09:02:17Z)
- Diffusion Models for Generative Artificial Intelligence: An Introduction for Applied Mathematicians [3.069335774032178]
Diffusion models offer state-of-the-art performance in generative AI for images.
We provide a brief introduction to diffusion models for applied mathematicians and statisticians.
arXiv Detail & Related papers (2023-12-21T20:20:52Z)
- Guided Diffusion from Self-Supervised Diffusion Features [49.78673164423208]
Guidance serves as a key concept in diffusion models, yet its effectiveness is often limited by the need for extra data annotation or pretraining.
We propose a framework to extract guidance from, and specifically for, diffusion models.
arXiv Detail & Related papers (2023-12-14T11:19:11Z)
- SODA: Bottleneck Diffusion Models for Representation Learning [75.7331354734152]
We introduce SODA, a self-supervised diffusion model, designed for representation learning.
The model incorporates an image encoder, which distills a source view into a compact representation, that guides the generation of related novel views.
We show that by imposing a tight bottleneck between the encoder and a denoising decoder, we can turn diffusion models into strong representation learners.
arXiv Detail & Related papers (2023-11-29T18:53:34Z)
- Learning by Distillation: A Self-Supervised Learning Framework for Optical Flow Estimation [71.76008290101214]
DistillFlow is a knowledge distillation approach to learning optical flow.
It achieves state-of-the-art unsupervised learning performance on both KITTI and Sintel datasets.
Our models ranked 1st among all monocular methods on the KITTI 2015 benchmark, and outperform all published methods on the Sintel Final benchmark.
arXiv Detail & Related papers (2021-06-08T09:13:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.