Controllable Stylistic Text Generation with Train-Time Attribute-Regularized Diffusion
- URL: http://arxiv.org/abs/2510.06386v1
- Date: Tue, 07 Oct 2025 19:09:19 GMT
- Title: Controllable Stylistic Text Generation with Train-Time Attribute-Regularized Diffusion
- Authors: Fan Zhou, Chang Tian, Tim Van de Cruys
- Abstract summary: Generating stylistic text with specific attributes is a key problem in controllable text generation. RegDiff is a regularized diffusion framework that leverages attribute features without requiring a pretrained classifier during sampling. Experiments on five datasets spanning multiple stylistic attributes demonstrate that RegDiff outperforms strong baselines in generating stylistic texts.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Generating stylistic text with specific attributes is a key problem in controllable text generation. Recently, diffusion models have emerged as a powerful paradigm for both visual and textual generation. Existing approaches can be broadly categorized into classifier-free guidance (CFG) and classifier guidance (CG) methods. While CFG effectively preserves semantic content, it often fails to provide effective attribute control. In contrast, CG modifies the denoising trajectory using classifier gradients, enabling better attribute alignment but incurring high computational costs during sampling and suffering from classifier generalization issues. In this work, we propose RegDiff, a regularized diffusion framework that leverages attribute features without requiring a pretrained classifier during sampling, thereby achieving controllable generation with reduced computational costs. Specifically, RegDiff employs a VAE-based encoder-decoder architecture to ensure reconstruction fidelity and a latent diffusion model trained with attribute supervision to enable controllable text generation. Attribute information is injected only during training. Experiments on five datasets spanning multiple stylistic attributes demonstrate that RegDiff outperforms strong baselines in generating stylistic texts. These results validate the effectiveness of RegDiff as an efficient solution for attribute-controllable text diffusion. Our code, datasets, and resources will be released upon publication at https://github.com/xxxx.
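The abstract describes the overall architecture (a VAE for reconstruction fidelity, a latent diffusion model with train-time attribute supervision) but not the exact objective. The following minimal PyTorch sketch shows one plausible reading of such a training step, assuming a standard DDPM-style latent diffusion formulation; the module interfaces (`vae`, `denoiser`, `attr_head`) and the weighting `lambda_attr` are hypothetical illustrations, not the paper's actual API.

```python
# A minimal sketch of a train-time attribute-regularized latent diffusion
# step, assuming a DDPM-style noising schedule. All module names and the
# lambda_attr weight are hypothetical, not taken from the paper.
import torch
import torch.nn.functional as F

def train_step(vae, denoiser, attr_head, tokens, attr_labels,
               alphas_cumprod, lambda_attr=0.1):
    # Encode text into a latent; the VAE reconstruction term preserves
    # semantic content (assumed interface: returns latent and loss).
    z0, recon_loss = vae(tokens)                        # z0: (B, D)

    # Standard forward noising of the clean latent at a random time-step.
    t = torch.randint(0, alphas_cumprod.numel(), (z0.size(0),),
                      device=z0.device)
    a_bar = alphas_cumprod[t].unsqueeze(-1)             # (B, 1)
    noise = torch.randn_like(z0)
    zt = a_bar.sqrt() * z0 + (1 - a_bar).sqrt() * noise

    # Denoiser predicts the injected noise; usual diffusion MSE loss.
    eps_pred = denoiser(zt, t)
    diff_loss = F.mse_loss(eps_pred, noise)

    # Attribute regularizer: the clean latent implied by the prediction
    # should be classifiable into the target attribute. This supervision
    # exists only at train time, so sampling needs no classifier.
    z0_pred = (zt - (1 - a_bar).sqrt() * eps_pred) / a_bar.sqrt()
    attr_loss = F.cross_entropy(attr_head(z0_pred), attr_labels)

    return recon_loss + diff_loss + lambda_attr * attr_loss
```

Under this reading, the attribute head plays the role that classifier gradients play in CG, but its cost is paid once during training rather than at every denoising step, which is consistent with the abstract's claim of classifier-free sampling at reduced computational cost.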
Related papers
- RepreGuard: Detecting LLM-Generated Text by Revealing Hidden Representation Patterns
Detecting text generated by large language models (LLMs) is crucial for preventing misuse and building trustworthy AI systems. We propose RepreGuard, an efficient statistics-based detection method. Experimental results show that RepreGuard outperforms all baselines with an average AUROC of 94.92% in both in-distribution (ID) and out-of-distribution (OOD) scenarios.
arXiv Detail & Related papers (2025-08-18T17:59:15Z) - DTPA: Dynamic Token-level Prefix Augmentation for Controllable Text Generation
The controllability of texts generated by Air-Decoding tends to decline with increasing sequence length. Different types of prefixes, including soft and hard prefixes, are also key factors influencing performance. We propose Dynamic Token-level Prefix Augmentation (DTPA), built on Air-Decoding, for controllable text generation.
arXiv Detail & Related papers (2025-08-06T03:20:33Z) - Towards Transformer-Based Aligned Generation with Self-Coherence Guidance
Existing Transformer-based Text-Guided Diffusion Models (TGDMs) often struggle to generate semantically aligned images, particularly with complex text prompts or multi-concept attribute binding challenges. We introduce a training-free approach for enhancing alignment in TGDMs. Our method addresses these challenges by directly optimizing cross-attention maps during the generation process.
arXiv Detail & Related papers (2025-03-22T07:03:57Z) - Exploring Diffusion Time-steps for Unsupervised Representation Learning
We build a theoretical framework that connects diffusion time-steps to hidden attributes.
On the CelebA, FFHQ, and Bedroom datasets, the learned features significantly improve classification.
arXiv Detail & Related papers (2024-01-21T08:35:25Z) - Air-Decoding: Attribute Distribution Reconstruction for Decoding-Time Controllable Text Generation
Controllable text generation (CTG) aims to generate text with desired attributes.
We propose a novel lightweight decoding framework named Air-Decoding.
Our method achieves new state-of-the-art control performance.
arXiv Detail & Related papers (2023-10-23T12:59:11Z) - DisCup: Discriminator Cooperative Unlikelihood Prompt-tuning for Controllable Text Generation
We propose a new CTG approach, namely DisCup, which incorporates the attribute knowledge of a discriminator to optimize the control prompts.
DisCup achieves new state-of-the-art control performance while maintaining efficient, high-quality text generation, relying on only around 10 virtual tokens.
arXiv Detail & Related papers (2022-10-18T02:59:06Z) - FAST: Improving Controllability for Text Generation with Feedback Aware Self-Training
Controllable text generation systems often leverage control codes to direct various properties of the output, such as style and length.
Inspired by recent work on causal inference for NLP, this paper reveals a previously overlooked flaw in these control code-based conditional text generation algorithms.
We propose two simple techniques to reduce these correlations in training sets.
arXiv Detail & Related papers (2022-10-06T19:00:51Z) - Classifiers are Better Experts for Controllable Text Generation
We show that the proposed method significantly outperforms the recent PPLM, GeDi, and DExperts methods on perplexity and on the sentiment accuracy of generated texts, as measured by an external classifier.
At the same time, it is easier to implement and tune, and has significantly fewer restrictions and requirements.
arXiv Detail & Related papers (2022-05-15T12:58:35Z) - Multi-Attribute Balanced Sampling for Disentangled GAN Controls
Various controls over the generated data can be extracted from the latent space of a pre-trained GAN.
We show that this approach outperforms state-of-the-art classifier-based methods while avoiding the need for disentanglement-enforcing post-processing.
arXiv Detail & Related papers (2021-10-28T08:44:13Z) - Latent Template Induction with Gumbel-CRFs
We explore the use of structured variational autoencoders to infer latent templates for sentence generation.
Used as a structured inference network, the model learns interpretable templates during training.
arXiv Detail & Related papers (2020-11-29T01:00:57Z)