Few-step Flow for 3D Generation via Marginal-Data Transport Distillation
- URL: http://arxiv.org/abs/2509.04406v1
- Date: Thu, 04 Sep 2025 17:24:31 GMT
- Title: Few-step Flow for 3D Generation via Marginal-Data Transport Distillation
- Authors: Zanwei Zhou, Taoran Yi, Jiemin Fang, Chen Yang, Lingxi Xie, Xinggang Wang, Wei Shen, Qi Tian,
- Abstract summary: We propose a novel framework, MDT-dist, for few-step 3D flow distillation.<n>Our approach is built upon a primary objective: distilling the pretrained model to learn the Marginal-Data Transport.<n>Our method reduces sampling steps of each flow transformer from 25 to 1 or 2, achieving 0.68s (1 step x 2) and 0.94s (2 steps x 2) latency with 9.0x and 6.5x speedup on A800.
- Score: 104.76254102015794
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Flow-based 3D generation models typically require dozens of sampling steps during inference. Though few-step distillation methods, particularly Consistency Models (CMs), have achieved substantial advancements in accelerating 2D diffusion models, they remain under-explored for more complex 3D generation tasks. In this study, we propose a novel framework, MDT-dist, for few-step 3D flow distillation. Our approach is built upon a primary objective: distilling the pretrained model to learn the Marginal-Data Transport. Directly learning this objective needs to integrate the velocity fields, while this integral is intractable to be implemented. Therefore, we propose two optimizable objectives, Velocity Matching (VM) and Velocity Distillation (VD), to equivalently convert the optimization target from the transport level to the velocity and the distribution level respectively. Velocity Matching (VM) learns to stably match the velocity fields between the student and the teacher, but inevitably provides biased gradient estimates. Velocity Distillation (VD) further enhances the optimization process by leveraging the learned velocity fields to perform probability density distillation. When evaluated on the pioneer 3D generation framework TRELLIS, our method reduces sampling steps of each flow transformer from 25 to 1 or 2, achieving 0.68s (1 step x 2) and 0.94s (2 steps x 2) latency with 9.0x and 6.5x speedup on A800, while preserving high visual and geometric fidelity. Extensive experiments demonstrate that our method significantly outperforms existing CM distillation methods, and enables TRELLIS to achieve superior performance in few-step 3D generation.
Related papers
- FlowConsist: Make Your Flow Consistent with Real Trajectory [99.22869983378062]
We argue that current fast-flow training paradigms suffer from two fundamental issues.<n> conditional velocities constructed from randomly paired noise-data samples introduce systematic trajectory drift.<n>We propose FlowConsist, a training framework designed to enforce trajectory consistency in fast flows.
arXiv Detail & Related papers (2026-02-06T03:24:23Z) - Transition Matching Distillation for Fast Video Generation [63.1049790376783]
We present Transition Matching Distillation (TMD), a novel framework for distilling video diffusion models into efficient few-step generators.<n>TMD matches the multi-step denoising trajectory of a diffusion model with a few-step probability transition process.<n>TMD provides a flexible and strong trade-off between generation speed and visual quality.
arXiv Detail & Related papers (2026-01-14T21:30:03Z) - Joint Distillation for Fast Likelihood Evaluation and Sampling in Flow-based Models [100.28111930893188]
Some of today's best generative models still require hundreds to thousands of neural function evaluations to compute a single likelihood.<n>We present fast flow joint distillation (F2D2), a framework that simultaneously reduces the number of NFEs required for both sampling and likelihood evaluation by two orders of magnitude.<n>F2D2 is modular, compatible with existing flow-based few-step sampling models, and requires only an additional divergence prediction head.
arXiv Detail & Related papers (2025-12-02T10:48:20Z) - MeanFlow Transformers with Representation Autoencoders [71.45823902973349]
MeanFlow (MF) is a diffusion-motivated generative model that enables efficient few-step generation by learning long jumps directly from noise to data.<n>We develop an efficient training and sampling scheme for MF in the latent space of a Representation Autoencoder (RAE)<n>We achieve a 1-step FID of 2.03, outperforming vanilla MF's 3.43, while reducing sampling GFLOPS by 38% and total training cost by 83% on ImageNet 256.
arXiv Detail & Related papers (2025-11-17T06:17:08Z) - Mean Flows for One-step Generative Modeling [64.4997821467102]
We propose a principled and effective framework for one-step generative modeling.<n>A well-defined identity between average and instantaneous velocities is derived and used to guide neural network training.<n>Our method, termed the MeanFlow model, is self-contained and requires no pre-training, distillation, or curriculum learning.
arXiv Detail & Related papers (2025-05-19T17:59:42Z) - Distilling Multi-view Diffusion Models into 3D Generators [4.3238419212557115]
We introduce DD3G, a formulation that Distills a multi-view Diffusion model (MV-DM) into a 3D Generator using gaussian splatting.<n> DD3G compresses and integrates extensive visual and spatial knowledge from the MV-DM.<n>We propose PEPD, a generator consisting of Pattern Extraction and Progressive Decoding phases, which enables efficient fusion of probabilistic flow.
arXiv Detail & Related papers (2025-04-01T06:32:48Z) - Learning Few-Step Diffusion Models by Trajectory Distribution Matching [18.229753357571116]
Trajectory Distribution Matching (TDM) is a unified distillation paradigm that combines the strengths of distribution and trajectory matching.<n>We develop a sampling-steps-aware objective that decouples learning targets across different steps, enabling more adjustable sampling.<n>Our model, TDM, outperforms existing methods on various backbones, delivering superior quality and significantly reduced training costs.
arXiv Detail & Related papers (2025-03-09T15:53:49Z) - ProReflow: Progressive Reflow with Decomposed Velocity [52.249464542399636]
Flow matching aims to reflow the diffusion process of diffusion models into a straight line for a few-step and even one-step generation.<n>We introduce progressive reflow, which progressively reflows the diffusion models in local timesteps until the whole diffusion progresses.<n>We also introduce aligned v-prediction, which highlights the importance of direction matching in flow matching over magnitude matching.
arXiv Detail & Related papers (2025-03-05T04:50:53Z) - TraFlow: Trajectory Distillation on Pre-Trained Rectified Flow [31.56008127287467]
We propose a trajectory distillation method, modelname, that enjoys the benefits of both and enables few-step generations.<n>TraFlow adopts the settings of consistency trajectory models, and further enforces the properties of self-consistency and straightness throughout the entire trajectory.
arXiv Detail & Related papers (2025-02-24T08:57:19Z) - Self-Refining Diffusion Samplers: Enabling Parallelization via Parareal Iterations [53.180374639531145]
Self-Refining Diffusion Samplers (SRDS) retain sample quality and can improve latency at the cost of additional parallel compute.<n>We take inspiration from the Parareal algorithm, a popular numerical method for parallel-in-time integration of differential equations.
arXiv Detail & Related papers (2024-12-11T11:08:09Z) - FlowDreamer: Exploring High Fidelity Text-to-3D Generation via Rectified Flow [17.919092916953183]
We propose a novel framework, named FlowDreamer, which yields high fidelity results with richer textual details and faster convergence.
Key insight is to leverage the coupling and reversible properties of the rectified flow model to search for the corresponding noise.
We introduce a novel Unique Matching Couple (UCM) loss, which guides the 3D model to optimize along the same trajectory.
arXiv Detail & Related papers (2024-08-09T11:40:20Z) - Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis [20.2271205957037]
Hyper-SD is a novel framework that amalgamates the advantages of ODE Trajectory Preservation and Reformulation.
We introduce Trajectory Segmented Consistency Distillation to progressively perform consistent distillation within pre-defined time-step segments.
We incorporate human feedback learning to boost the performance of the model in a low-step regime and mitigate the performance loss incurred by the distillation process.
arXiv Detail & Related papers (2024-04-21T15:16:05Z) - Fast Sampling of Diffusion Models via Operator Learning [74.37531458470086]
We use neural operators, an efficient method to solve the probability flow differential equations, to accelerate the sampling process of diffusion models.
Compared to other fast sampling methods that have a sequential nature, we are the first to propose a parallel decoding method.
We show our method achieves state-of-the-art FID of 3.78 for CIFAR-10 and 7.83 for ImageNet-64 in the one-model-evaluation setting.
arXiv Detail & Related papers (2022-11-24T07:30:27Z) - Towards Fast, Accurate and Stable 3D Dense Face Alignment [73.01620081047336]
We propose a novel regression framework named 3DDFA-V2 which makes a balance among speed, accuracy and stability.
We present a virtual synthesis method to transform one still image to a short-video which incorporates in-plane and out-of-plane face moving.
arXiv Detail & Related papers (2020-09-21T15:37:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.