Fine-grained Text Style Transfer with Diffusion-Based Language Models
- URL: http://arxiv.org/abs/2305.19512v2
- Date: Mon, 12 Jun 2023 02:13:16 GMT
- Title: Fine-grained Text Style Transfer with Diffusion-Based Language Models
- Authors: Yiwei Lyu, Tiange Luo, Jiacheng Shi, Todd C. Hollon, Honglak Lee
- Abstract summary: We trained a diffusion-based model on the StylePTB dataset, the standard benchmark for fine-grained text style transfer.
Our model was able to achieve state-of-the-art performance on both individual and compositional transfers.
- Score: 50.02698074338317
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion probabilistic models have shown great success in generating
high-quality images controllably, and researchers have tried to bring this
controllability to the text generation domain. Previous works on diffusion-based
language models have shown that they can be trained without external knowledge
(such as pre-trained weights) and still achieve stable performance and
controllability. In this paper, we trained a diffusion-based model on the StylePTB
dataset, the standard benchmark for fine-grained text style transfer. The
tasks in StylePTB require much more refined control over the output text
than the tasks evaluated in previous works, and our model was able to
achieve state-of-the-art performance on StylePTB on both individual and
compositional transfers. Moreover, our model, trained on limited data from
StylePTB without external knowledge, outperforms previous works that utilized
pretrained weights, embeddings, and external grammar parsers, and this may
indicate that diffusion-based language models have great potential under
low-resource settings.
Related papers
- ARTIST: Improving the Generation of Text-rich Images with Disentangled Diffusion Models [52.23899502520261]
We introduce a new framework named ARTIST to focus on the learning of text structures.
We finetune a visual diffusion model, enabling it to assimilate textual structure information from the pretrained textual model.
Empirical results on the MARIO-Eval benchmark underscore the effectiveness of the proposed method, showing an improvement of up to 15% in various metrics.
arXiv Detail & Related papers (2024-06-17T19:31:24Z)
- UniDiff: Advancing Vision-Language Models with Generative and Discriminative Learning [86.91893533388628]
This paper presents UniDiff, a unified multi-modal model that integrates image-text contrastive learning (ITC), text-conditioned image synthesis learning (IS), and reciprocal semantic consistency modeling (RSC).
UniDiff demonstrates versatility in both multi-modal understanding and generative tasks.
arXiv Detail & Related papers (2023-06-01T15:39:38Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- Self-conditioned Embedding Diffusion for Text Generation [28.342735885752493]
Self-conditioned Embedding Diffusion is a continuous diffusion mechanism that operates on token embeddings.
We show that our text diffusion models generate samples comparable with those produced by standard autoregressive language models.
arXiv Detail & Related papers (2022-11-08T13:30:27Z)
- Few-shot Text Classification with Dual Contrastive Consistency [31.141350717029358]
In this paper, we explore how to utilize pre-trained language models to perform few-shot text classification.
We adopt supervised contrastive learning on a small amount of labeled data and consistency regularization on vast unlabeled data.
arXiv Detail & Related papers (2022-09-29T19:26:23Z)
- Non-Parallel Text Style Transfer with Self-Parallel Supervision [19.441780035577352]
We propose LaMer, a novel text style transfer framework based on large-scale language models.
LaMer first mines the roughly parallel expressions in the non-parallel datasets with scene graphs, and then employs MLE training, followed by imitation learning refinement, to leverage the intrinsic parallelism within the data.
On two benchmark tasks (sentiment & formality transfer) and a newly proposed challenging task (political stance transfer), our model achieves qualitative advances in transfer accuracy, content preservation, and fluency.
arXiv Detail & Related papers (2022-04-18T01:38:35Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Semi-supervised Formality Style Transfer using Language Model Discriminator and Mutual Information Maximization [52.867459839641526]
Formality style transfer is the task of converting informal sentences to grammatically-correct formal sentences.
We propose a semi-supervised formality style transfer model that utilizes a language model-based discriminator to maximize the likelihood of the output sentence being formal.
Experiments showed that our model outperformed previous state-of-the-art baselines significantly in terms of both automated metrics and human judgement.
arXiv Detail & Related papers (2020-10-10T21:05:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.