Improving Text Style Transfer using Masked Diffusion Language Models with Inference-time Scaling
- URL: http://arxiv.org/abs/2508.10995v2
- Date: Mon, 18 Aug 2025 15:41:22 GMT
- Title: Improving Text Style Transfer using Masked Diffusion Language Models with Inference-time Scaling
- Authors: Tejomay Kishor Padole, Suyash P Awate, Pushpak Bhattacharyya
- Abstract summary: Masked diffusion language models (MDMs) have recently gained traction as a viable generative framework for natural language. We propose a verifier-based inference-time scaling method that aids in finding a better candidate generation during the denoising process of the MDM. Our experiments demonstrate the application of MDMs to standard text-style transfer tasks and establish MDMs as a better alternative to autoregressive language models.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Masked diffusion language models (MDMs) have recently gained traction as a viable generative framework for natural language. This can be attributed to their scalability and ease of training compared to other diffusion paradigms for discrete data, establishing them as state-of-the-art non-autoregressive generators for discrete data. Diffusion models in general have shown an excellent ability to improve generation quality through inference-time scaling, either by increasing the number of denoising steps or by using external verifiers on top of the outputs of each step to guide the generation. In this work, we propose a verifier-based inference-time scaling method that aids in finding a better candidate generation during the denoising process of the MDM. Our experiments demonstrate the application of MDMs to standard text-style transfer tasks and establish MDMs as a better alternative to autoregressive language models. Additionally, we show that a simple soft-value-based verifier setup for MDMs using off-the-shelf pre-trained embedding models leads to significant gains in generation quality, even when used on top of typical classifier-free guidance setups in the existing literature.
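As a rough illustration of the kind of verifier-based inference-time scaling the abstract describes (this is not the authors' code), the sketch below samples several candidate unmaskings at each denoising step and keeps the one an off-the-shelf sentence-embedding verifier scores closest to target-style exemplars. The MDM interface (`denoise_step`, `decode`), the candidate count, and the choice of `all-MiniLM-L6-v2` as the embedding model are assumptions for illustration; the paper's soft-value verifier and classifier-free guidance details may differ.

```python
# Minimal sketch of best-of-n, verifier-guided denoising for an MDM.
# The `mdm` object and its methods are hypothetical placeholders.
import torch
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed off-the-shelf embedding model

def style_score(texts, style_exemplars):
    # Soft value: mean cosine similarity to embeddings of target-style exemplars.
    cand = embedder.encode(texts, convert_to_tensor=True)
    ref = embedder.encode(style_exemplars, convert_to_tensor=True)
    return util.cos_sim(cand, ref).mean(dim=1)  # one score per candidate

def scaled_denoise(mdm, x_masked, style_exemplars, steps=32, n_candidates=8):
    # At each denoising step, sample several candidate partial generations
    # and keep the one the verifier scores highest.
    x = x_masked
    for t in reversed(range(steps)):
        candidates = [mdm.denoise_step(x, t) for _ in range(n_candidates)]  # hypothetical MDM API
        texts = [mdm.decode(c) for c in candidates]                         # hypothetical MDM API
        scores = style_score(texts, style_exemplars)
        x = candidates[int(torch.argmax(scores))]
    return mdm.decode(x)
```

Increasing `n_candidates` trades extra compute for generation quality, which is the inference-time scaling knob the abstract refers to; the verifier itself requires no training beyond the pre-trained embedding model.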
Related papers
- One-step Language Modeling via Continuous Denoising [36.18484491074519]
We show that language models leveraging flow-based continuous denoising can outperform discrete diffusion in both quality and speed. Our work calls into question the widely held hypothesis that discrete diffusion processes are necessary for generative modeling over discrete modalities.
arXiv Detail & Related papers (2026-02-18T19:23:07Z) - Unifying Masked Diffusion Models with Various Generation Orders and Beyond [56.70289720766803]
Masked diffusion models (MDMs) are a potential alternative to autoregressive models (ARMs) for language generation. We propose the order-expressive masked diffusion model (OeMDM) for a broad class of diffusion generative processes. We introduce the learnable-order masked diffusion model (LoMDM), which jointly learns the generation ordering and the diffusion backbone.
arXiv Detail & Related papers (2026-02-02T13:54:32Z) - Towards Latent Diffusion Suitable For Text [7.293508593001522]
We introduce Neural Flow Diffusion Models for language generation, an extension of NFDM that enables the straightforward application of continuous diffusion models to discrete state spaces. Our model substantially reduces the likelihood gap with autoregressive models of the same size, while achieving sample quality comparable to that of previous latent diffusion models.
arXiv Detail & Related papers (2026-01-07T20:50:59Z) - Discrete Diffusion in Large Language and Multimodal Models: A Survey [61.86669998363359]
We provide a systematic survey of Discrete Diffusion Language Models (dLLMs) and Discrete Diffusion Multimodal Language Models (dMLLMs). Unlike autoregressive (AR) models, dLLMs and dMLLMs adopt a multi-token, parallel decoding paradigm using full attention and a denoising-based generation strategy (a schematic sketch of this decoding paradigm appears after this list). We trace the historical development of dLLMs and dMLLMs, formalize the underlying mathematical frameworks, list commonly used modeling methods, and categorize representative models.
arXiv Detail & Related papers (2025-06-16T17:59:08Z) - On Designing Diffusion Autoencoders for Efficient Generation and Representation Learning [14.707830064594056]
Diffusion autoencoders (DAs) use an input-dependent latent variable to capture representations alongside the diffusion process. Better generative modelling is the primary goal of another class of diffusion models: those that learn their forward (noising) process.
arXiv Detail & Related papers (2025-05-30T18:14:09Z) - Variational Autoencoding Discrete Diffusion with Enhanced Dimensional Correlations Modeling [48.96034602889216]
Variational Autoencoding Discrete Diffusion (VADD) is a novel framework that enhances discrete diffusion with latent variable modeling. By introducing an auxiliary recognition model, VADD enables stable training via variational lower bounds and amortized inference over the training set. Empirical results on 2D toy data, pixel-level image generation, and text generation demonstrate that VADD consistently outperforms MDM baselines.
arXiv Detail & Related papers (2025-05-23T01:45:47Z) - Automated Learning of Semantic Embedding Representations for Diffusion Models [1.688134675717698]
We employ a multi-level denoising autoencoder framework to expand the representation capacity of denoising diffusion models. Our work shows that DDMs are not only suitable for generative tasks but are also potentially advantageous for general-purpose deep learning applications.
arXiv Detail & Related papers (2025-05-09T02:10:46Z) - Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction [88.65168366064061]
We introduce Discrete Denoising Posterior Prediction (DDPP), a novel framework that casts the task of steering pre-trained MDMs as a problem of probabilistic inference.
Our framework leads to a family of three novel objectives that are all simulation-free, and thus scalable.
We substantiate our designs via wet-lab validation, where we observe transient expression of reward-optimized protein sequences.
arXiv Detail & Related papers (2024-10-10T17:18:30Z) - PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model [37.2192243883707]
We propose PLANNER, a model that combines latent semantic diffusion with autoregressive generation to generate fluent text.
Results on semantic generation, text completion and summarization show its effectiveness in generating high-quality long-form text.
arXiv Detail & Related papers (2023-06-05T01:36:39Z) - A Cheaper and Better Diffusion Language Model with Soft-Masked Noise [62.719656543880596]
Masked-Diffuse LM is a novel diffusion model for language modeling, inspired by linguistic features in languages.
Specifically, we design a linguistically informed forward process that adds corruptions to the text through strategic soft-masking to better noise the textual data.
We demonstrate that our Masked-Diffuse LM achieves better generation quality than state-of-the-art diffusion models with greater efficiency.
arXiv Detail & Related papers (2023-04-10T17:58:42Z) - Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
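As referenced in the discrete-diffusion survey entry above, the following schematic (not taken from any of the listed papers) illustrates the multi-token, parallel, denoising-style decoding paradigm that dLLMs use: start from an all-mask sequence, predict every masked position at once, and commit the most confident predictions each round. The `model` callable, the mask handling, and the unmasking schedule are simplified assumptions for illustration only.

```python
# Schematic confidence-based parallel unmasking for a masked-diffusion decoder.
# `model` is a hypothetical callable mapping token ids -> per-position logits.
import torch

def parallel_unmask(model, seq_len, mask_id, steps=8):
    # Start fully masked and iteratively reveal tokens over `steps` rounds.
    x = torch.full((1, seq_len), mask_id, dtype=torch.long)
    for step in range(steps):
        logits = model(x)                       # (1, seq_len, vocab)
        probs = logits.softmax(dim=-1)
        conf, pred = probs.max(dim=-1)          # per-position confidence and argmax token
        still_masked = x == mask_id
        if not still_masked.any():
            break
        # Commit the top-k most confident predictions among still-masked positions.
        k = max(1, int(still_masked.sum()) // (steps - step))
        conf = conf.masked_fill(~still_masked, -1.0)
        idx = conf.topk(k, dim=-1).indices
        x[0, idx[0]] = pred[0, idx[0]]
    return x
```

Real dLLM samplers differ in how they schedule the number of tokens revealed per round and how they remask low-confidence positions, but the core idea of full-attention, multi-token denoising is as sketched.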