Diffusion Large Language Models for Black-Box Optimization
- URL: http://arxiv.org/abs/2601.14446v1
- Date: Tue, 20 Jan 2026 19:59:29 GMT
- Title: Diffusion Large Language Models for Black-Box Optimization
- Authors: Ye Yuan, Can, Chen, Zipeng Sun, Dinghuai Zhang, Christopher Pal, Xue Liu,
- Abstract summary: offline black-box optimization aims to find optimal designs based solely on an offline dataset of designs and their labels.<n>Traditional methods rely on task-specific proxy or generative models, overlooking the in-context learning capabilities of pre-trained large language models (LLMs)
- Score: 39.443620489103985
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Offline black-box optimization (BBO) aims to find optimal designs based solely on an offline dataset of designs and their labels. Such scenarios frequently arise in domains like DNA sequence design and robotics, where only a few labeled data points are available. Traditional methods typically rely on task-specific proxy or generative models, overlooking the in-context learning capabilities of pre-trained large language models (LLMs). Recent efforts have adapted autoregressive LLMs to BBO by framing task descriptions and offline datasets as natural language prompts, enabling direct design generation. However, these designs often contain bidirectional dependencies, which left-to-right models struggle to capture. In this paper, we explore diffusion LLMs for BBO, leveraging their bidirectional modeling and iterative refinement capabilities. This motivates our in-context denoising module: we condition the diffusion LLM on the task description and the offline dataset, both formatted in natural language, and prompt it to denoise masked designs into improved candidates. To guide the generation toward high-performing designs, we introduce masked diffusion tree search, which casts the denoising process as a step-wise Monte Carlo Tree Search that dynamically balances exploration and exploitation. Each node represents a partially masked design, each denoising step is an action, and candidates are evaluated via expected improvement under a Gaussian Process trained on the offline dataset. Our method, dLLM, achieves state-of-the-art results in few-shot settings on design-bench.
Related papers
- WeMMU: Enhanced Bridging of Vision-Language Models and Diffusion Models via Noisy Query Tokens [69.97021957331326]
We propose Noisy Query Tokens, which learn a distributed representation space between the VLM and Diffusion Model via end-to-end optimization.<n>We also introduce a VAE branch with linear projection to recover fine-grained image details.
arXiv Detail & Related papers (2025-12-02T09:02:20Z) - A Survey on Diffusion Language Models [30.00199970146068]
Diffusion Language Models (DLMs) are an alternative to the dominant autoregressive (AR) paradigm.<n>DLMs possess inherent advantages in reducing inference latency and capturing bidirectional context.<n>Recent advancements have allowed DLMs to show performance comparable to their autoregressive counterparts.
arXiv Detail & Related papers (2025-08-14T17:47:22Z) - DRUM: Learning Demonstration Retriever for Large MUlti-modal Models [10.884258583493175]
We propose a novel framework, underlinedemonstration underlineretriever for large munderlineulti-modal underlinemodel (DRUM)<n>First, we discuss the retrieval strategies for a visual-language task, assuming an embedding model is given. And we propose to concate the image and text embeddings to enhance the retrieval performance.<n>Second, we propose to re-rank the demonstrations retrieved by the embedding model via the LVLM's feedbacks, and calculate a list-wise ranking loss for training
arXiv Detail & Related papers (2024-12-10T15:56:12Z) - Online Intrinsic Rewards for Decision Making Agents from Large Language Model Feedback [52.763620660061115]
ONI is a distributed architecture that simultaneously learns an RL policy and an intrinsic reward function.<n>We explore a range of algorithmic choices for reward modeling with varying complexity.<n>Our approach achieves state-of-the-art performance across a range of challenging tasks from the NetHack Learning Environment.
arXiv Detail & Related papers (2024-10-30T13:52:43Z) - A Cheaper and Better Diffusion Language Model with Soft-Masked Noise [62.719656543880596]
Masked-Diffuse LM is a novel diffusion model for language modeling, inspired by linguistic features in languages.
Specifically, we design a linguistic-informed forward process which adds corruptions to the text through strategically soft-masking to better noise the textual data.
We demonstrate that our Masked-Diffuse LM can achieve better generation quality than the state-of-the-art diffusion models with better efficiency.
arXiv Detail & Related papers (2023-04-10T17:58:42Z) - Guiding Large Language Models via Directional Stimulus Prompting [114.84930073977672]
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs.
Instead of directly adjusting LLMs, our method employs a small tunable policy model to generate an auxiliary directional stimulus prompt for each input instance.
arXiv Detail & Related papers (2023-02-22T17:44:15Z) - Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.