Fast Direct: Query-Efficient Online Black-box Guidance for Diffusion-model Target Generation
- URL: http://arxiv.org/abs/2502.01692v5
- Date: Sat, 29 Mar 2025 05:45:56 GMT
- Title: Fast Direct: Query-Efficient Online Black-box Guidance for Diffusion-model Target Generation
- Authors: Kim Yong Tan, Yueming Lyu, Ivor Tsang, Yew-Soon Ong
- Abstract summary: Existing guided diffusion models either rely on training the guidance model with pre-collected datasets or require the objective functions to be differentiable. In this work, we propose a novel and simple algorithm, $\textbf{Fast Direct}$, for query-efficient online black-box target generation. Our Fast Direct builds a pseudo-target on the data manifold to update the noise sequence of the diffusion model with a universal direction.
- Score: 27.773614349764234
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Guided diffusion-model generation is a promising direction for customizing the generation process of a pre-trained diffusion model to address specific downstream tasks. Existing guided diffusion models either rely on training the guidance model with pre-collected datasets or require the objective functions to be differentiable. However, for most real-world tasks, offline datasets are often unavailable, and their objective functions are often not differentiable, such as image generation with human preferences, molecular generation for drug discovery, and material design. Thus, we need an $\textbf{online}$ algorithm capable of collecting data during runtime and supporting a $\textbf{black-box}$ objective function. Moreover, the $\textbf{query efficiency}$ of the algorithm is also critical because the objective evaluation of the query is often expensive in real-world scenarios. In this work, we propose a novel and simple algorithm, $\textbf{Fast Direct}$, for query-efficient online black-box target generation. Our Fast Direct builds a pseudo-target on the data manifold to update the noise sequence of the diffusion model with a universal direction, which is promising to perform query-efficient guided generation. Extensive experiments on twelve high-resolution ($\small {1024 \times 1024}$) image target generation tasks and six 3D-molecule target generation tasks show $\textbf{6}\times$ up to $\textbf{10}\times$ query efficiency improvement and $\textbf{11}\times$ up to $\textbf{44}\times$ query efficiency improvement, respectively. Our implementation is publicly available at: https://github.com/kimyong95/guide-stable-diffusion/tree/fast-direct
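The online, black-box, query-counted setting described in the abstract can be illustrated with a toy loop. This is a minimal sketch and not the Fast Direct algorithm: the paper's pseudo-target construction and universal direction are not reproduced here. `toy_generator`, `black_box_score`, `online_guidance`, and all parameter values are hypothetical stand-ins chosen only to show the query loop (generate, query the objective, reuse the feedback to update the noise).

```python
import random


def black_box_score(sample):
    # Hypothetical objective: piecewise constant because of the rounding,
    # so no useful gradient exists. Only its value can be queried.
    return -sum(abs(round(v, 1) - 0.7) for v in sample)


def toy_generator(noise):
    # Stand-in for a frozen pre-trained sampler that maps noise to a sample.
    return [max(0.0, min(1.0, 0.5 + 0.5 * z)) for z in noise]


def online_guidance(dim=4, batch=8, rounds=10, step=0.5, seed=0):
    rng = random.Random(seed)
    noises = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(batch)]
    best_noise, best_score, queries = None, float("-inf"), 0
    history = []
    for _ in range(rounds):
        # Online data collection: every evaluation costs one query.
        for z in noises:
            score = black_box_score(toy_generator(z))
            queries += 1
            if score > best_score:
                best_noise, best_score = list(z), score
        history.append(best_score)
        # Assumed update rule (not the paper's): pull every noise vector
        # toward the best noise found so far, plus a little exploration.
        noises = [
            [zi + step * (bi - zi) + 0.1 * rng.gauss(0, 1)
             for zi, bi in zip(z, best_noise)]
            for z in noises
        ]
    return best_score, queries, history


if __name__ == "__main__":
    best, used, _ = online_guidance()
    print(f"best score {best:.3f} after {used} queries")
```

The query counter is the quantity that a query-efficiency comparison would measure: two methods are compared by how many objective evaluations each needs to reach the same score.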
Related papers
- Self-Guided Action Diffusion [53.38661283705301]
Self-guided action diffusion is a more efficient variant of bidirectional decoding tailored for diffusion-based policies. Our method achieves up to 70% higher success rates than existing counterparts on challenging dynamic tasks.
arXiv Detail & Related papers (2025-08-17T00:39:15Z) - Diffusion Tree Sampling: Scalable inference-time alignment of diffusion models [13.312007032203857]
Adapting a pretrained diffusion model to new objectives at inference time remains an open problem in generative modeling. We introduce a tree-based approach that samples from the reward-aligned target density by propagating terminal rewards back through the diffusion chain. By reusing information from previous generations, we get an anytime algorithm that turns additional compute into steadily better samples.
arXiv Detail & Related papers (2025-06-25T17:59:10Z) - Intention-Conditioned Flow Occupancy Models [69.79049994662591]
Large-scale pre-training has fundamentally changed how machine learning research is done today. Applying this same framework to reinforcement learning is appealing because it offers compelling avenues for addressing core challenges in RL. Recent advances in generative AI have provided new tools for modeling highly complex distributions.
arXiv Detail & Related papers (2025-06-10T15:27:46Z) - Flexiffusion: Training-Free Segment-Wise Neural Architecture Search for Efficient Diffusion Models [50.260693393896716]
Diffusion models (DMs) are powerful generative models capable of producing high-fidelity images but constrained by high computational costs. We propose Flexiffusion, a training-free NAS framework that jointly optimizes generation schedules and model architectures without modifying pre-trained parameters. Our work pioneers a resource-efficient paradigm for searching high-speed DMs without sacrificing quality.
arXiv Detail & Related papers (2025-06-03T06:02:50Z) - You Only Look One Step: Accelerating Backpropagation in Diffusion Sampling with Gradient Shortcuts [13.191937642688279]
Diffusion models (DMs) have recently demonstrated remarkable success in modeling large-scale data distributions. Many downstream tasks require guiding the generated content based on specific differentiable metrics, typically necessitating backpropagation during the generation process. We propose a more efficient alternative that approaches the problem from the perspective of parallel denoising.
arXiv Detail & Related papers (2025-05-12T12:09:11Z) - Finding Local Diffusion Schrödinger Bridge using Kolmogorov-Arnold Network [7.651628106346734]
This paper proposes for the first time to find local Diffusion Schrödinger Bridges (LDSB) in the diffusion path subspace.
The experiment shows that our LDSB significantly improves the quality and efficiency of image generation using the same pre-trained denoising network.
arXiv Detail & Related papers (2025-02-27T04:34:03Z) - Do We Need to Design Specific Diffusion Models for Different Tasks? Try ONE-PIC [77.8851460746251]
We propose a simple, efficient, and general approach to fine-tune diffusion models. ONE-PIC enhances the inherited generative ability in the pretrained diffusion models without introducing additional modules. Our method is simple and efficient, which streamlines the adaptation process and achieves excellent performance with lower costs.
arXiv Detail & Related papers (2024-12-07T11:19:32Z) - Adaptively Controllable Diffusion Model for Efficient Conditional Image Generation [8.857237929151795]
We propose a new adaptive framework, $\textit{Adaptively Controllable Diffusion (AC-Diff) Model}$, to automatically and fully control the generation process.
AC-Diff is expected to largely reduce the average number of generation steps and execution time while maintaining the same performance as diffusion models in the literature.
arXiv Detail & Related papers (2024-11-19T21:26:30Z) - Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding [84.3224556294803]
Diffusion models excel at capturing the natural design spaces of images, molecules, DNA, RNA, and protein sequences.
We aim to optimize downstream reward functions while preserving the naturalness of these design spaces.
Our algorithm integrates soft value functions, which look ahead to how intermediate noisy states lead to high rewards in the future.
arXiv Detail & Related papers (2024-08-15T16:47:59Z) - Optimizing Diffusion Models for Joint Trajectory Prediction and Controllable Generation [49.49868273653921]
Diffusion models are promising for joint trajectory prediction and controllable generation in autonomous driving.
We introduce Optimal Gaussian Diffusion (OGD) and Estimated Clean Manifold (ECM) Guidance.
Our methodology streamlines the generative process, enabling practical applications with reduced computational overhead.
arXiv Detail & Related papers (2024-08-01T17:59:59Z) - Diffusion-BBO: Diffusion-Based Inverse Modeling for Online Black-Box Optimization [20.45482366024264]
Online black-box optimization (BBO) aims to optimize an objective function by iteratively querying a black-box oracle in a sample-efficient way. We propose Diffusion-BBO, a sample-efficient online BBO framework leveraging the conditional diffusion model as the inverse surrogate model.
arXiv Detail & Related papers (2024-06-30T06:58:31Z) - Improving GFlowNets for Text-to-Image Diffusion Alignment [48.42367859859971]
We explore techniques that do not directly maximize the reward but rather generate high-reward images with relatively high probability. Our method could effectively align large-scale text-to-image diffusion models with given reward information.
arXiv Detail & Related papers (2024-06-02T06:36:46Z) - An Efficient Membership Inference Attack for the Diffusion Model by Proximal Initialization [58.88327181933151]
In this paper, we propose an efficient query-based membership inference attack (MIA).
Experimental results indicate that the proposed method can achieve competitive performance with only two queries on both discrete-time and continuous-time diffusion models.
To the best of our knowledge, this work is the first to study the robustness of diffusion models to MIA in the text-to-speech task.
arXiv Detail & Related papers (2023-05-26T16:38:48Z) - $\Delta$-Patching: A Framework for Rapid Adaptation of Pre-trained Convolutional Networks without Base Performance Loss [71.46601663956521]
Models pre-trained on large-scale datasets are often fine-tuned to support newer tasks and datasets that arrive over time.
We propose $\Delta$-Patching for fine-tuning neural network models in an efficient manner, without the need to store model copies.
Our experiments show that $\Delta$-Networks outperform earlier model patching work while only requiring a fraction of parameters to be trained.
arXiv Detail & Related papers (2023-03-26T16:39:44Z) - DiffusionRet: Generative Text-Video Retrieval with Diffusion Model [56.03464169048182]
Existing text-video retrieval solutions focus on maximizing the conditional likelihood, i.e., p(candidates|query).
We creatively tackle this task from a generative viewpoint and model the correlation between the text and the video as their joint probability p(candidates,query).
This is accomplished through a diffusion-based text-video retrieval framework (DiffusionRet), which models the retrieval task as a process of gradually generating joint distribution from noise.
arXiv Detail & Related papers (2023-03-17T10:07:19Z) - DORE: Document Ordered Relation Extraction based on Generative Framework [56.537386636819626]
This paper investigates the root cause of the underwhelming performance of the existing generative DocRE models.
We propose to generate a symbolic and ordered sequence from the relation matrix, which is deterministic and easier for the model to learn.
Experimental results on four datasets show that our proposed method can improve the performance of the generative DocRE models.
arXiv Detail & Related papers (2022-10-28T11:18:10Z) - ${\rm N{\small ode}S{\small ig}}$: Random Walk Diffusion meets Hashing for Scalable Graph Embeddings [7.025709586759654]
${\rm N{\small ode}S{\small ig}}$ is a scalable embedding model that computes binary node representations.
${\rm N{\small ode}S{\small ig}}$ exploits random walk diffusion probabilities via stable random projection hashing.
arXiv Detail & Related papers (2020-10-01T09:07:37Z) - AutoSimulate: (Quickly) Learning Synthetic Data Generation [70.82315853981838]
We propose an efficient alternative for optimal synthetic data generation based on a novel differentiable approximation of the objective.
We demonstrate that the proposed method finds the optimal data distribution faster (up to $50\times$), with significantly reduced training data generation (up to $30\times$) and better accuracy ($+8.7\%$) on real-world test datasets than previous methods.
arXiv Detail & Related papers (2020-08-16T11:36:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.