Stochastic Interpolants via Conditional Dependent Coupling
- URL: http://arxiv.org/abs/2509.23122v1
- Date: Sat, 27 Sep 2025 05:03:08 GMT
- Title: Stochastic Interpolants via Conditional Dependent Coupling
- Authors: Chenrui Ma, Xi Xiao, Tianyang Wang, Xiao Wang, Yanning Shen
- Abstract summary: Existing image generation models face critical challenges regarding the trade-off between computation and fidelity. We introduce a unified multistage generative framework based on our proposed Conditional Dependent Coupling strategy. It decomposes the generative process into interpolant trajectories at multiple stages, ensuring accurate distribution learning while enabling end-to-end optimization.
- Score: 36.84747986070112
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing image generation models face critical challenges regarding the trade-off between computation and fidelity. Specifically, models relying on a pretrained Variational Autoencoder (VAE) suffer from information loss, limited detail, and the inability to support end-to-end training. In contrast, models operating directly in the pixel space incur prohibitive computational cost. Although cascade models can mitigate computational cost, stage-wise separation prevents effective end-to-end optimization, hampers knowledge sharing, and often results in inaccurate distribution learning within each stage. To address these challenges, we introduce a unified multistage generative framework based on our proposed Conditional Dependent Coupling strategy. It decomposes the generative process into interpolant trajectories at multiple stages, ensuring accurate distribution learning while enabling end-to-end optimization. Importantly, the entire process is modeled as a single unified Diffusion Transformer, eliminating the need for disjoint modules and also enabling knowledge sharing. Extensive experiments demonstrate that our method achieves both high fidelity and efficiency across multiple resolutions.
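The central mechanism in the abstract, per-stage interpolant trajectories whose source distribution depends on the previous (coarser) stage, can be sketched as a training loss. The sketch below is a hedged illustration under assumed choices: a linear interpolant, nearest-neighbor upsampling as the coupling, and a toy convolutional velocity network standing in for the unified Diffusion Transformer. The names `StageVelocityNet` and `ccd_loss` are ours, not the paper's.

```python
# Minimal sketch of a multistage stochastic-interpolant loss, assuming a
# linear interpolant x_t = (1 - t) * x_src + t * x_tgt per stage; conditioning
# each stage's source on the coarser stage's target is our illustrative
# reading of "conditional dependent coupling", not the paper's exact form.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StageVelocityNet(nn.Module):
    """Stand-in for one resolution stage of the shared Diffusion Transformer."""
    def __init__(self, channels=3, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels + 1, hidden, 3, padding=1), nn.SiLU(),
            nn.Conv2d(hidden, channels, 3, padding=1),
        )
    def forward(self, x_t, t):
        # Broadcast the scalar time onto a feature map and predict velocity.
        t_map = t.view(-1, 1, 1, 1).expand(-1, 1, *x_t.shape[-2:])
        return self.net(torch.cat([x_t, t_map], dim=1))

def ccd_loss(model, x_hi):
    """One coupled stage: the low-res image is both a target (coarse stage)
    and, upsampled, the dependent source of the high-res interpolant."""
    x_lo = F.avg_pool2d(x_hi, 2)                      # coarse-stage target
    x_src = F.interpolate(x_lo, scale_factor=2)       # dependent coupling
    t = torch.rand(x_hi.size(0), device=x_hi.device)
    tb = t.view(-1, 1, 1, 1)
    x_t = (1 - tb) * x_src + tb * x_hi                # interpolant trajectory
    v_target = x_hi - x_src                           # linear-path velocity
    return F.mse_loss(model(x_t, t), v_target)

model = StageVelocityNet()
loss = ccd_loss(model, torch.randn(4, 3, 32, 32))
loss.backward()
```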
Related papers
- Adapformer: Adaptive Channel Management for Multivariate Time Series Forecasting [49.40321003932633]
Adapformer is an advanced Transformer-based framework that merges the benefits of channel-independent (CI) and channel-dependent (CD) methodologies through effective channel management. Adapformer achieves superior performance over existing models, enhancing both predictive accuracy and computational efficiency.
arXiv Detail & Related papers (2025-11-18T16:24:05Z)
- HardFlow: Hard-Constrained Sampling for Flow-Matching Models via Trajectory Optimization [4.249024052507976]
We introduce a novel framework that reformulates hard-constrained sampling as a trajectory optimization problem. Our key insight is to leverage numerical optimal control to steer the sampling trajectory so that constraints are satisfied precisely at the terminal time. Our algorithm, which we name HardFlow, substantially outperforms existing methods in both constraint satisfaction and sample quality.
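As a rough illustration of the general idea (not the HardFlow algorithm itself), the sketch below rolls out a simple Euler integration of a velocity field and optimizes additive controls so a terminal constraint holds; `velocity`, `constraint`, and the penalty weight are assumed placeholders.

```python
# Hedged sketch of constraint-steered flow sampling via trajectory
# optimization: optimize per-step controls so the terminal state satisfies
# a hard constraint while staying close to the nominal flow.
import torch

def velocity(x, t):
    # Placeholder flow-matching field; a trained model would go here.
    return -x * (1.0 - t)

def constraint(x):
    # Example hard constraint: coordinates must sum to 1 at terminal time.
    return x.sum(dim=-1) - 1.0

def steered_sample(x0, steps=20, opt_iters=100, lam=10.0):
    dt = 1.0 / steps
    u = torch.zeros(steps, *x0.shape, requires_grad=True)  # controls
    opt = torch.optim.Adam([u], lr=0.05)
    for _ in range(opt_iters):
        x = x0
        for k in range(steps):
            x = x + dt * (velocity(x, k * dt) + u[k])      # controlled Euler step
        # Penalize terminal violation and control effort (stay near the flow).
        loss = lam * constraint(x).pow(2).mean() + u.pow(2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        x = x0
        for k in range(steps):
            x = x + dt * (velocity(x, k * dt) + u[k])
    return x

samples = steered_sample(torch.randn(8, 4))
```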
arXiv Detail & Related papers (2025-11-11T16:33:57Z)
- Nonparametric Data Attribution for Diffusion Models [57.820618036556084]
Data attribution for generative models seeks to quantify the influence of individual training examples on model outputs. We propose a nonparametric attribution method that operates entirely on data, measuring influence via patch-level similarity between generated and training images.
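A minimal sketch of patch-level similarity attribution follows, assuming influence of a training image is scored by the best cosine match between its patches and the generated image's patches; the patch size and the max-mean aggregation are illustrative choices, not the paper's exact protocol.

```python
# Hedged sketch: nonparametric influence scores from patch cosine similarity.
import torch
import torch.nn.functional as F

def patches(img, p=8):
    # (C, H, W) -> (N, C*p*p) non-overlapping patches, L2-normalized.
    u = img.unfold(1, p, p).unfold(2, p, p)           # C, H/p, W/p, p, p
    u = u.permute(1, 2, 0, 3, 4).reshape(-1, img.size(0) * p * p)
    return F.normalize(u, dim=1)

def influence(generated, train_set, p=8):
    g = patches(generated, p)
    scores = []
    for x in train_set:
        sim = g @ patches(x, p).T                     # patch-pair cosine sims
        scores.append(sim.max(dim=1).values.mean())   # best match per gen patch
    return torch.stack(scores)                        # one score per train image

train = [torch.randn(3, 32, 32) for _ in range(5)]
print(influence(torch.randn(3, 32, 32), train))
```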
arXiv Detail & Related papers (2025-10-16T03:37:16Z)
- Composition and Alignment of Diffusion Models using Constrained Learning [79.36736636241564]
Diffusion models have become prevalent in generative modeling due to their ability to sample from complex distributions. Two commonly used methods are: (i) alignment, which involves fine-tuning a diffusion model to align it with a reward; and (ii) composition, which combines several pre-trained diffusion models, each emphasizing a desirable attribute in the generated outputs. We propose a constrained optimization framework that unifies alignment and composition of diffusion models by enforcing that the aligned model satisfies reward constraints and/or remains close to (potentially multiple) pre-trained models.
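One standard way to solve such a constrained problem is a Lagrangian with dual ascent. The sketch below illustrates that pattern on a toy model: stay close to a frozen reference while enforcing an expected reward above a threshold. The toy reward, closeness term, and step sizes are assumptions for illustration, not the paper's formulation.

```python
# Hedged sketch of reward-constrained fine-tuning via primal-dual updates.
import torch
import torch.nn as nn

model = nn.Linear(4, 4)
ref = nn.Linear(4, 4)                        # frozen pre-trained reference
for p in ref.parameters():
    p.requires_grad_(False)

def reward(y):                               # toy differentiable reward
    return y.mean(dim=1)

opt = torch.optim.SGD(model.parameters(), lr=1e-2)
lam, tau, dual_lr = torch.tensor(0.0), 0.5, 0.1

for step in range(200):
    x = torch.randn(32, 4)
    closeness = (model(x) - ref(x)).pow(2).mean()       # stay near reference
    violation = tau - reward(model(x)).mean()           # >0 when constraint unmet
    loss = closeness + lam * violation                  # Lagrangian
    opt.zero_grad(); loss.backward(); opt.step()
    lam = (lam + dual_lr * violation.detach()).clamp(min=0.0)  # dual ascent
```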
arXiv Detail & Related papers (2025-08-26T15:06:30Z)
- Hybrid Autoregressive-Diffusion Model for Real-Time Sign Language Production [0.0]
We develop a hybrid approach that combines autoregressive and diffusion models for Sign Language Production (SLP). To capture fine-grained body movements, we design a Multi-Scale Pose Representation module that separately extracts detailed features from distinct articulators. We introduce a Confidence-Aware Causal Attention mechanism that utilizes joint-level confidence scores to dynamically guide the pose generation process.
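One plausible reading of confidence-aware causal attention is adding per-step confidence scores, in log space, to the attention logits under a causal mask; the sketch below shows that bias form, which is our assumption rather than the paper's exact mechanism.

```python
# Hedged sketch of confidence-biased causal attention.
import torch
import torch.nn.functional as F

def confidence_causal_attention(q, k, v, conf):
    # q, k, v: (B, T, D); conf: (B, T) in (0, 1], higher = more trusted.
    T, d = q.size(1), q.size(-1)
    logits = q @ k.transpose(1, 2) / d ** 0.5           # (B, T, T)
    logits = logits + conf.log().unsqueeze(1)           # down-weight shaky keys
    causal = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
    logits = logits.masked_fill(causal, float('-inf'))  # no peeking ahead
    return F.softmax(logits, dim=-1) @ v

out = confidence_causal_attention(
    torch.randn(2, 5, 16), torch.randn(2, 5, 16),
    torch.randn(2, 5, 16), torch.rand(2, 5).clamp(min=0.1))
```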
arXiv Detail & Related papers (2025-07-12T01:34:50Z)
- Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent [72.10987117380584]
Merging multiple expert models offers a promising approach for performing multi-task learning without accessing their original data. We find existing methods discard task-specific information that, while causing conflicts, is crucial for performance. Our approach consistently outperforms previous methods, achieving state-of-the-art results across diverse architectures and tasks in both vision and NLP domains.
arXiv Detail & Related papers (2025-01-02T12:45:21Z)
- Efficient Fine-Tuning and Concept Suppression for Pruned Diffusion Models [93.76814568163353]
We propose a novel bilevel optimization framework for pruned diffusion models. This framework consolidates the fine-tuning and unlearning processes into a unified phase. It is compatible with various pruning and concept unlearning methods.
arXiv Detail & Related papers (2024-12-19T19:13:18Z)
- Inverse design with conditional cascaded diffusion models [0.0]
Adjoint-based design optimizations are usually computationally expensive, and these costs scale with resolution.
We extend the use of diffusion models over traditional generative models by proposing the conditional cascaded diffusion model (cCDM).
Our study compares cCDM against a cGAN model with transfer learning.
While both models show decreased performance with reduced high-resolution training data, the cCDM loses its superiority to the cGAN model with transfer learning when training data is limited.
arXiv Detail & Related papers (2024-08-16T04:54:09Z)
- Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion [53.33473557562837]
Solving multi-objective optimization problems for large deep neural networks is a challenging task due to the complexity of the loss landscape and the expensive computational cost.
We propose a practical and scalable approach to solve this problem via mixture of experts (MoE) based model fusion.
By ensembling the weights of specialized single-task models, the MoE module can effectively capture the trade-offs between multiple objectives.
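The weight-ensembling idea can be sketched as a preference-conditioned merge: given task-specialized models, combine their parameters with a preference vector to trace points on an approximate Pareto front. A real MoE fusion would learn a router; the direct convex combination below is a simplifying assumption.

```python
# Hedged sketch of preference-weighted parameter fusion across expert models.
import torch
import torch.nn as nn

def merge_weights(models, prefs):
    """prefs: non-negative weights summing to 1, one per expert model."""
    merged = type(models[0])(4, 2)        # same architecture as the experts
    state = {k: sum(w * m.state_dict()[k] for w, m in zip(prefs, models))
             for k in models[0].state_dict()}
    merged.load_state_dict(state)
    return merged

experts = [nn.Linear(4, 2) for _ in range(3)]
pareto_point = merge_weights(experts, [0.5, 0.3, 0.2])
```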
arXiv Detail & Related papers (2024-06-14T07:16:18Z)
- Adv-KD: Adversarial Knowledge Distillation for Faster Diffusion Sampling [2.91204440475204]
Diffusion Probabilistic Models (DPMs) have emerged as a powerful class of deep generative models.
They rely on sequential denoising steps during sample generation.
We propose a novel method that integrates denoising phases directly into the model's architecture.
arXiv Detail & Related papers (2024-05-31T08:19:44Z)
- AdaDiff: Accelerating Diffusion Models through Step-Wise Adaptive Computation [32.74923906921339]
Diffusion models achieve great success in generating diverse and high-fidelity images, yet their widespread application is hampered by their inherently slow generation speed.
We propose AdaDiff, an adaptive framework that dynamically allocates computation resources in each sampling step to improve the generation efficiency of diffusion models.
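Step-wise adaptive computation can be illustrated with a simple early-exit loop: at each denoising step, run more refinement blocks only while consecutive block outputs still disagree. The disagreement threshold used as the exit signal below is an assumed proxy, not AdaDiff's learned policy.

```python
# Hedged sketch of per-step adaptive depth for a denoiser.
import torch
import torch.nn as nn

blocks = nn.ModuleList(nn.Linear(8, 8) for _ in range(6))

def adaptive_denoise_step(x, tol=1e-2):
    h, used = x, 0
    for block in blocks:
        h_new = torch.tanh(block(h))
        used += 1
        if (h_new - h).abs().mean() < tol:   # converged early: stop spending compute
            return h_new, used
        h = h_new
    return h, used

x = torch.randn(2, 8)
for step in range(4):                         # a few sampling steps
    x, used = adaptive_denoise_step(x)
    print(f"step {step}: used {used} blocks")
```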
arXiv Detail & Related papers (2023-09-29T09:10:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.