P-Masking: Power Law Masking Improves Multi-attribute Controlled Generation
- URL: http://arxiv.org/abs/2410.24201v1
- Date: Thu, 31 Oct 2024 17:55:45 GMT
- Authors: Mohamed Elgaar, Hadi Amiri
- Abstract summary: LingGen is a novel approach for controlled text generation that offers precise control over a wide array of linguistic attributes.
LingGen employs a dynamic P-MASKING strategy, which samples masking rates from a power law distribution during training.
Experiments demonstrate that LingGen surpasses current state-of-the-art models in both attribute control accuracy and text fluency.
- Abstract: We introduce LingGen, a novel approach for controlled text generation that offers precise control over a wide array of linguistic attributes, even as the number of attributes varies. LingGen employs a dynamic P-MASKING strategy, which samples masking rates from a power law distribution during training. This innovative approach enables the model to develop robust representations and adapt its attribute control capabilities across a variable number of attributes, from a single attribute to multiple complex configurations. The P-MASKING technique enhances LingGen's ability to manage different levels of attribute visibility, resulting in superior performance in multi-attribute generation tasks. Our experiments demonstrate that LingGen surpasses current state-of-the-art models in both attribute control accuracy and text fluency, particularly excelling in scenarios with varying attribute demands. Additionally, our ablation studies highlight the effectiveness of P-MASKING and the influence of different base language models on performance. These findings demonstrate LingGen's potential for applications requiring precise and adaptable control over multiple linguistic attributes in text generation.
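The abstract's core mechanism, sampling per-example masking rates from a power law distribution and masking attribute inputs at that rate, can be sketched as follows. This is an illustrative sketch only, not the paper's implementation: the exponent `alpha`, the use of NumPy's power distribution, and the `<MASK>` placeholder token are assumptions.

```python
import numpy as np

def p_masking_rates(batch_size, alpha=2.0, rng=None):
    """Sample per-example masking rates from a power law on [0, 1].

    Sketch of the P-MASKING idea described in the abstract; `alpha` and
    the NumPy power distribution are assumptions, not the paper's exact
    parameterization.
    """
    rng = np.random.default_rng() if rng is None else rng
    # NumPy's power distribution has density alpha * x**(alpha - 1) on [0, 1],
    # so larger alpha skews toward higher masking rates.
    return rng.power(alpha, size=batch_size)

def mask_attributes(attributes, rate, rng=None):
    """Mask each attribute independently with probability `rate`."""
    rng = np.random.default_rng() if rng is None else rng
    keep = rng.random(len(attributes)) >= rate
    return [a if k else "<MASK>" for a, k in zip(attributes, keep)]
```

Drawing a fresh rate per training example exposes the model to everything from fully visible to fully hidden attribute configurations, which is the stated route to robustness across a variable number of attributes.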
Related papers
- Towards Lightweight, Adaptive and Attribute-Aware Multi-Aspect Controllable Text Generation with Large Language Models [40.54453001537357]
Multi-aspect controllable text generation aims to control attributes of the generated text from multiple aspects.
Supervised fine-tuning methods are often employed for this task due to their simplicity and effectiveness.
We propose a lightweight, adaptive and attribute-aware framework for multi-aspect controllable text generation.
arXiv Detail & Related papers (2025-02-19T06:56:02Z)
- Multi-Attribute Constraint Satisfaction via Language Model Rewriting [67.5778646504987]
Multi-Attribute Constraint Satisfaction (MACS) is a method for finetuning language models to satisfy user-specified constraints on multiple external real-valued attributes.
Our work opens new avenues for generalized and real-valued multi-attribute control, with implications for diverse applications spanning NLP and bioinformatics.
arXiv Detail & Related papers (2024-12-26T12:36:39Z)
- UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models [88.16197692794707]
UniGen is a comprehensive framework designed to produce diverse, accurate, and highly controllable datasets.
To augment data diversity, UniGen incorporates an attribute-guided generation module and a group checking feature.
Extensive experiments demonstrate the superior quality of data generated by UniGen.
arXiv Detail & Related papers (2024-06-27T07:56:44Z)
- Successor Features for Efficient Multisubject Controlled Text Generation [48.37713738712319]
We introduce SF-GEN, which is grounded in two primary concepts: successor features (SFs) and language model rectification.
SF-GEN seamlessly integrates the two to enable dynamic steering of text generation with no need to alter the LLM's parameters.
To the best of our knowledge, our research represents the first application of successor features in text generation.
arXiv Detail & Related papers (2023-11-03T00:17:08Z)
- MacLaSa: Multi-Aspect Controllable Text Generation via Efficient Sampling from Compact Latent Space [110.85888003111653]
Multi-aspect controllable text generation aims to generate fluent sentences that possess multiple desired attributes simultaneously.
We introduce a novel approach for multi-aspect control, namely MacLaSa, that estimates compact latent space for multiple aspects.
We show that MacLaSa outperforms several strong baselines on attribute relevance and textual quality while maintaining a high inference speed.
arXiv Detail & Related papers (2023-05-22T07:30:35Z)
- Controllable Dialogue Generation with Disentangled Multi-grained Style Specification and Attribute Consistency Reward [47.96949534259019]
We propose a controllable dialogue generation model to steer response generation under multi-attribute constraints.
We categorize the commonly used control attributes into global and local ones, which possess different granularities of effects on response generation.
Our model can significantly outperform competitive baselines in terms of response quality, content diversity and controllability.
arXiv Detail & Related papers (2021-09-14T14:29:38Z)
- Progressive Open-Domain Response Generation with Multiple Controllable Attributes [13.599621571488033]
We propose a Progressively trained Hierarchical Vari-Decoder (PHED) to tackle this task.
PHED deploys a Conditional Variational AutoEncoder (CVAE) on top of a Transformer to include one aspect of attributes at each stage.
PHED significantly outperforms the state-of-the-art neural generation models and produces more diverse responses as expected.
arXiv Detail & Related papers (2021-06-07T08:48:39Z)
- Transformer-based Conditional Variational Autoencoder for Controllable Story Generation [39.577220559911055]
We investigate large-scale latent variable models (LVMs) for neural story generation with objectives in two threads: generation effectiveness and controllability.
We advocate to revive latent variable modeling, essentially the power of representation learning, in the era of Transformers.
Specifically, we integrate latent representation vectors with a Transformer-based pre-trained architecture to build a conditional variational autoencoder (CVAE).
arXiv Detail & Related papers (2021-01-04T08:31:11Z)
- Controllable Text Generation with Focused Variation [71.07811310799664]
Focused-Variation Network (FVN) is a novel model to control language generation.
FVN learns disjoint discrete latent spaces for each attribute inside codebooks, which allows for both controllability and diversity.
We evaluate FVN on two text generation datasets with annotated content and style, and show state-of-the-art performance as assessed by automatic and human evaluations.
arXiv Detail & Related papers (2020-09-25T06:31:06Z)
- Control, Generate, Augment: A Scalable Framework for Multi-Attribute Text Generation [22.70189685469752]
We introduce CGA, a conditional VAE architecture, to control, generate, and augment text.
We show the value of the individual model components in an ablation study.
We show high quality, diversity and attribute control in the generated sentences through a series of automatic and human assessments.
arXiv Detail & Related papers (2020-04-30T17:31:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.