Diffusion Language Models Generation Can Be Halted Early
- URL: http://arxiv.org/abs/2305.10818v4
- Date: Mon, 12 Feb 2024 09:34:39 GMT
- Title: Diffusion Language Models Generation Can Be Halted Early
- Authors: Sofia Maria Lo Cicero Vaina, Nikita Balagansky, Daniil Gavrilov
- Abstract summary: Diffusion Language Models (DLMs) are a promising avenue for text generation due to their practical properties, such as tractable, controllable generation.
One way to reduce the performance gap between DLMs and autoregressive language models is to speed up the generation of DLMs.
In this work, we propose a novel methodology that addresses this issue by enabling the execution of more generation steps within a given time frame.
- Score: 4.726777092009553
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion Language Models (DLMs) are a promising avenue for text generation
due to their practical properties, such as tractable, controllable generation. They
also have the advantage of not having to predict text autoregressively.
However, despite these notable features, DLMs have not yet reached the
performance levels of their autoregressive counterparts. One way to
reduce the performance gap between these two types of language models is to
speed up the generation of DLMs. We therefore propose a novel methodology to
address this issue: it enables the execution of more generation
steps within a given time frame, leading to higher-quality outputs.
Specifically, our methods estimate the completeness of DLM text generation and
allow adaptive halting of the generation process. We evaluate our methods on
the Plaid, SSD, and CDCD DLMs and build a cohesive perspective on their generation
workflows. Finally, we confirm that our methods allow halting these models and
decrease generation time by 10-40% without a drop in the quality of
model samples.
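The halting criteria themselves are model-specific and detailed in the paper; as a rough illustration of the idea only, the following minimal Python/PyTorch sketch halts a hypothetical denoising loop once the decoded tokens stop changing between steps. The `denoise_step` and `decode` methods, and the token-stability proxy for completeness, are assumptions for illustration, not the authors' actual interface.
```python
import torch

@torch.no_grad()
def generate_with_halting(model, x_T, num_steps, patience=3):
    """Run reverse-diffusion steps, halting early once decoded text is stable."""
    x = x_T                                   # initial noise latents
    prev_tokens, stable_steps = None, 0
    for t in reversed(range(num_steps)):
        x = model.denoise_step(x, t)          # one denoising step (assumed API)
        tokens = model.decode(x)              # map latents to the nearest tokens
        if prev_tokens is not None and torch.equal(tokens, prev_tokens):
            stable_steps += 1                 # decoded text unchanged this step
            if stable_steps >= patience:      # generation judged complete: halt
                break
        else:
            stable_steps = 0
        prev_tokens = tokens
    return tokens
```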
Related papers
- Language Models can Self-Lengthen to Generate Long Texts [74.96074422345806]
This paper introduces an innovative iterative training framework called Self-Lengthen.
It leverages only the intrinsic knowledge and skills of Large Language Models without the need for auxiliary data or proprietary models.
Experiments on benchmarks and human evaluations show that Self-Lengthen outperforms existing methods in long-text generation.
arXiv Detail & Related papers (2024-10-31T13:47:10Z) - DALD: Improving Logits-based Detector without Logits from Black-box LLMs [56.234109491884126]
Large Language Models (LLMs) have revolutionized text generation, producing outputs that closely mimic human writing.
We present Distribution-Aligned LLMs Detection (DALD), an innovative framework that redefines the state-of-the-art performance in black-box text detection.
DALD is designed to align the surrogate model's distribution with that of unknown target LLMs, ensuring enhanced detection capability and resilience against rapid model iterations.
arXiv Detail & Related papers (2024-06-07T19:38:05Z) - Enforcing Paraphrase Generation via Controllable Latent Diffusion [60.82512050963046]
We propose Latent Diffusion Paraphraser (LDP), a novel paraphrase generation method that models a controllable diffusion process.
Experiments show that LDP achieves improved and diverse paraphrase generation compared to baselines.
arXiv Detail & Related papers (2024-04-13T09:24:32Z) - Text-Guided Molecule Generation with Diffusion Language Model [23.170313481324598]
We propose Text-Guided Molecule Generation with a Diffusion Language Model (TGM-DLM).
TGM-DLM updates token embeddings within the SMILES string collectively and iteratively, using a two-phase diffusion generation process.
We demonstrate that TGM-DLM outperforms MolT5-Base, an autoregressive model, without the need for additional data resources.
arXiv Detail & Related papers (2024-02-20T14:29:02Z) - FiLM: Fill-in Language Models for Any-Order Generation [71.42044325886194]
Fill-in Language Model (FiLM) is a new language modeling approach that allows for flexible generation at any position without adhering to a specific generation order.
During inference, FiLM can seamlessly insert missing phrases, sentences, or paragraphs.
FiLM outperforms existing infilling methods that rely on left-to-right language models trained on rearranged text segments.
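As a hedged illustration of any-order fill-in decoding (not FiLM's actual decoding procedure), masked positions can be filled in an arbitrary order while conditioning on all currently visible tokens; `predict_token` below is a hypothetical stand-in for the model.
```python
import random

def fill_in(tokens, predict_token, mask="[MASK]"):
    """tokens: list with `mask` at missing positions; returns the filled list."""
    holes = [i for i, t in enumerate(tokens) if t == mask]
    random.shuffle(holes)                     # any generation order is allowed
    for i in holes:
        tokens[i] = predict_token(tokens, i)  # condition on both left and right context
    return tokens
```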
arXiv Detail & Related papers (2023-10-15T19:37:39Z) - PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model [37.2192243883707]
We propose PLANNER, a model that combines latent semantic diffusion with autoregressive generation to generate fluent text.
Results on semantic generation, text completion and summarization show its effectiveness in generating high-quality long-form text.
arXiv Detail & Related papers (2023-06-05T01:36:39Z) - Extrapolating Multilingual Understanding Models as Multilingual
Generators [82.1355802012414]
This paper explores methods to endow multilingual understanding models with generation abilities, yielding a unified model.
We propose a Semantic-Guided Alignment-then-Denoising (SGA) approach to adapt an encoder into a multilingual generator with a small number of new parameters.
arXiv Detail & Related papers (2023-05-22T15:33:21Z) - ELMER: A Non-Autoregressive Pre-trained Language Model for Efficient and
Effective Text Generation [97.64625999380425]
We study the text generation task within the framework of pre-trained language models (PLMs).
By leveraging the early exit technique, ELMER generates tokens at different layers, according to their prediction confidence.
Experiments on three text generation tasks show that ELMER significantly outperforms NAR models.
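A minimal sketch of confidence-based early exit in this spirit; the per-layer heads and the fixed threshold below are illustrative assumptions, not ELMER's actual architecture.
```python
import torch

@torch.no_grad()
def early_exit_forward(layers, heads, h, threshold=0.9):
    """h: (seq_len, hidden) token states; returns one token id per position."""
    seq_len = h.size(0)
    preds = torch.full((seq_len,), -1, dtype=torch.long)
    done = torch.zeros(seq_len, dtype=torch.bool)
    for layer, head in zip(layers, heads):
        h = layer(h)                        # ordinary transformer layer
        probs = head(h).softmax(dim=-1)     # this layer's token distribution
        conf, tok = probs.max(dim=-1)
        exit_now = (~done) & (conf >= threshold)
        preds[exit_now] = tok[exit_now]     # confident tokens exit at this layer
        done |= exit_now
        if done.all():                      # every position has exited early
            break
    preds[~done] = tok[~done]               # remaining tokens use the last layer
    return preds
```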
arXiv Detail & Related papers (2022-10-24T14:46:47Z) - POINTER: Constrained Progressive Text Generation via Insertion-based
Generative Pre-training [93.79766670391618]
We present POINTER, a novel insertion-based approach for hard-constrained text generation.
The proposed method operates by progressively inserting new tokens between existing tokens in a parallel manner.
The resulting coarse-to-fine hierarchy makes the generation process intuitive and interpretable.
arXiv Detail & Related papers (2020-05-01T18:11:54Z)