DiffusionBERT: Improving Generative Masked Language Models with
Diffusion Models
- URL: http://arxiv.org/abs/2211.15029v2
- Date: Wed, 30 Nov 2022 15:41:24 GMT
- Title: DiffusionBERT: Improving Generative Masked Language Models with
Diffusion Models
- Authors: Zhengfu He, Tianxiang Sun, Kuanning Wang, Xuanjing Huang, Xipeng Qiu
- Abstract summary: DiffusionBERT is a new generative masked language model based on discrete diffusion models.
We propose a new noise schedule for the forward diffusion process that controls the degree of noise added at each step.
Experiments on unconditional text generation demonstrate that DiffusionBERT achieves significant improvement over existing diffusion models for text.
- Score: 81.84866217721361
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present DiffusionBERT, a new generative masked language model based on
discrete diffusion models. Diffusion models and many pre-trained language
models have a shared training objective, i.e., denoising, making it possible to
combine the two powerful models and enjoy the best of both worlds. On the one
hand, diffusion models offer a promising training strategy that helps improve
the generation quality. On the other hand, pre-trained denoising language
models (e.g., BERT) can be used as a good initialization that accelerates
convergence. We explore training BERT to learn the reverse process of a
discrete diffusion process with an absorbing state and elucidate several
designs to improve it. First, we propose a new noise schedule for the forward
diffusion process that controls the degree of noise added at each step based on
the information of each token. Second, we investigate several designs of
incorporating the time step into BERT. Experiments on unconditional text
generation demonstrate that DiffusionBERT achieves significant improvement over
existing diffusion models for text (e.g., D3PM and Diffusion-LM) and previous
generative masked language models in terms of perplexity and BLEU score.
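As a rough illustration of the "discrete diffusion process with an absorbing state" mentioned in the abstract, the sketch below masks tokens over T steps. It assumes the uniform absorbing schedule commonly used for this kind of process, where the per-step mask rate is 1/(T - t + 1) and the marginal mask probability at step t is therefore t/T. It does not reproduce the token-information-aware schedule the paper proposes, and the names `step_mask_rate`, `mask_probability`, and `forward_diffuse` are hypothetical.

```python
import random

MASK = "[MASK]"  # absorbing state: once a token is masked it stays masked


def step_mask_rate(t, num_steps):
    """Per-step probability that a still-unmasked token is masked at step t.

    Assumed uniform absorbing schedule: beta_t = 1 / (T - t + 1).
    This is NOT the token-aware schedule proposed in the paper.
    """
    if not 1 <= t <= num_steps:
        raise ValueError("t must be in [1, num_steps]")
    return 1.0 / (num_steps - t + 1)


def mask_probability(t, num_steps):
    """Marginal probability that a token is masked after t steps.

    Under the schedule above the survival product telescopes to (T - t) / T,
    so the mask probability grows linearly as t / T.
    """
    if not 0 <= t <= num_steps:
        raise ValueError("t must be in [0, num_steps]")
    return t / num_steps


def forward_diffuse(tokens, t, num_steps, rng):
    """Sample x_t directly from x_0 by masking each token independently."""
    p = mask_probability(t, num_steps)
    return [MASK if rng.random() < p else tok for tok in tokens]


if __name__ == "__main__":
    rng = random.Random(0)
    sentence = "diffusion models denoise text step by step".split()
    for t in (0, 5, 10):
        print(t, forward_diffuse(sentence, t, 10, rng))
```

A denoising model such as BERT would then be trained to predict the original tokens at the masked positions of x_t, which is the reverse process the abstract refers to.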
Related papers
- Scaling Diffusion Language Models via Adaptation from Autoregressive Models [105.70889434492143]
Diffusion Language Models (DLMs) have emerged as a promising new paradigm for text generative modeling.
We show that we can convert AR models ranging from 127M to 7B parameters into diffusion models DiffuGPT and DiffuLLaMA, using less than 200B tokens for training.
Our experimental results reveal that these models outperform earlier DLMs and are competitive with their AR counterparts.
(2024-10-23T14:04:22Z)
- Simple and Effective Masked Diffusion Language Models [48.68198363304619]
We show that simple masked discrete diffusion is more performant than previously thought.
We apply an effective training recipe that improves the performance of masked diffusion models.
Our objective has a simple form -- it is a mixture of classical masked language modeling losses.
(2024-06-11T17:51:40Z)
- Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation [59.184980778643464]
Fine-tuning Diffusion Models remains an underexplored frontier in generative artificial intelligence (GenAI).
In this paper, we introduce an innovative technique called self-play fine-tuning for diffusion models (SPIN-Diffusion).
Our approach offers an alternative to conventional supervised fine-tuning and RL strategies, significantly improving both model performance and alignment.
(2024-02-15T18:59:18Z)
- Likelihood-Based Diffusion Language Models [13.916640262862215]
We take the first steps towards closing the likelihood gap between autoregressive and diffusion-based language models.
We pursue this goal through algorithmic improvements, scaling laws, and increased compute.
We release Plaid 1B, a large diffusion language model which outperforms GPT-2 124M in likelihood on benchmark datasets.
(2023-05-30T16:43:31Z)
- LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image
Diffusion Models with Large Language Models [62.75006608940132]
This work proposes to enhance prompt understanding capabilities in text-to-image diffusion models.
Our method leverages a pretrained large language model for grounded generation in a novel two-stage process.
Our method significantly outperforms the base diffusion model and several strong baselines in accurately generating images.
(2023-05-23T03:59:06Z)
- A Cheaper and Better Diffusion Language Model with Soft-Masked Noise [62.719656543880596]
Masked-Diffuse LM is a novel diffusion model for language modeling, inspired by linguistic features in languages.
Specifically, we design a linguistically informed forward process that corrupts the text through strategic soft-masking to better noise the textual data.
We demonstrate that our Masked-Diffuse LM can achieve better generation quality than the state-of-the-art diffusion models with better efficiency.
(2023-04-10T17:58:42Z)
- MagicFusion: Boosting Text-to-Image Generation Performance by Fusing
Diffusion Models [20.62953292593076]
We propose a simple yet effective method called Saliency-aware Noise Blending (SNB) that can empower the fused text-guided diffusion models to achieve more controllable generation.
SNB is training-free and can be completed within a DDIM sampling process. Additionally, it can automatically align the semantics of two noise spaces without requiring additional annotations such as masks.
(2023-03-23T09:30:39Z)