A Survey on Parallel Text Generation: From Parallel Decoding to Diffusion Language Models
- URL: http://arxiv.org/abs/2508.08712v2
- Date: Wed, 13 Aug 2025 13:24:25 GMT
- Title: A Survey on Parallel Text Generation: From Parallel Decoding to Diffusion Language Models
- Authors: Lingzhe Zhang, Liancheng Fang, Chiming Duan, Minghua He, Leyi Pan, Pei Xiao, Shiyu Huang, Yunpeng Zhai, Xuming Hu, Philip S. Yu, Aiwei Liu
- Abstract summary: We survey parallel text generation techniques aimed at breaking the token-by-token generation bottleneck and improving inference efficiency. We categorize existing approaches into AR-based and Non-AR-based paradigms, and provide a detailed examination of the core techniques within each category. We highlight recent advancements, identify open challenges, and outline promising directions for future research in parallel text generation.
- Score: 49.97547209846335
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As text generation has become a core capability of modern Large Language Models (LLMs), it underpins a wide range of downstream applications. However, most existing LLMs rely on autoregressive (AR) generation, producing one token at a time based on previously generated context, resulting in limited generation speed due to the inherently sequential nature of the process. To address this challenge, an increasing number of researchers have begun exploring parallel text generation, a broad class of techniques aimed at breaking the token-by-token generation bottleneck and improving inference efficiency. Despite growing interest, there remains a lack of comprehensive analysis on what specific techniques constitute parallel text generation and how they improve inference performance. To bridge this gap, we present a systematic survey of parallel text generation methods. We categorize existing approaches into AR-based and Non-AR-based paradigms, and provide a detailed examination of the core techniques within each category. Following this taxonomy, we assess their theoretical trade-offs in terms of speed, quality, and efficiency, and examine their potential for combination and comparison with alternative acceleration strategies. Finally, based on our findings, we highlight recent advancements, identify open challenges, and outline promising directions for future research in parallel text generation. We have also created a GitHub repository for indexing relevant papers and open resources, available at https://github.com/zhanglingzhe0820/Awesome-Parallel-Text-Generation.
Related papers
- RouteRAG: Efficient Retrieval-Augmented Generation from Text and Graph via Reinforcement Learning [69.87510139069218]
Retrieval-Augmented Generation (RAG) integrates non-parametric knowledge into Large Language Models (LLMs). Recent progress has advanced text-based RAG to multi-turn reasoning through Reinforcement Learning (RL). We introduce RouteRAG, an RL-based framework that enables LLMs to perform multi-turn and adaptive graph-text hybrid RAG.
arXiv Detail & Related papers (2025-12-10T10:05:31Z) - LSR-MCTS: Alleviating Long Range Dependency in Code Generation [42.10272627826627]
Large language models (LLMs) have significantly advanced the code generation task. We propose the LSR-MCTS algorithm, which leverages MCTS to determine the code line by line and select the optimal path.
arXiv Detail & Related papers (2025-04-10T04:03:25Z) - Towards Better Open-Ended Text Generation: A Multicriteria Evaluation Framework [0.1979158763744267]
Open-ended text generation has become a prominent task in natural language processing. Evaluating the quality of these models and the employed decoding strategies remains challenging. This paper proposes novel methods for both relative and absolute rankings of decoding methods.
arXiv Detail & Related papers (2024-10-24T11:32:01Z) - A Comprehensive Survey of Accelerated Generation Techniques in Large Language Models [2.091322528026356]
This paper presents a survey of accelerated generation techniques in autoregressive language models.
We categorize these techniques into several key areas: speculative decoding, early exiting mechanisms, and non-autoregressive methods.
arXiv Detail & Related papers (2024-05-15T07:36:56Z) - Evaluating, Understanding, and Improving Constrained Text Generation for Large Language Models [49.74036826946397]
This study investigates constrained text generation for large language models (LLMs).
Our research mainly focuses on mainstream open-source LLMs, categorizing constraints into lexical, structural, and relation-based types.
Results illuminate LLMs' capacity and deficiency to incorporate constraints and provide insights for future developments in constrained text generation.
arXiv Detail & Related papers (2023-10-25T03:58:49Z) - Learning to Rank in Generative Retrieval [62.91492903161522]
Generative retrieval aims to generate identifier strings of relevant passages as the retrieval target.
We propose a learning-to-rank framework for generative retrieval, dubbed LTRGR.
This framework only requires an additional learning-to-rank training phase to enhance current generative retrieval systems.
arXiv Detail & Related papers (2023-06-27T05:48:14Z) - $\textit{latent}$-GLAT: Glancing at Latent Variables for Parallel Text Generation [65.29170569821093]
Parallel text generation has received widespread attention due to its success in generation efficiency.
In this paper, we propose $\textit{latent}$-GLAT, which employs discrete latent variables to capture word categorical information.
Experimental results show that our method outperforms strong baselines without the help of an autoregressive model.
arXiv Detail & Related papers (2022-04-05T07:34:12Z) - A Survey on Retrieval-Augmented Text Generation [53.04991859796971]
Retrieval-augmented text generation has remarkable advantages and has achieved state-of-the-art performance in many NLP tasks.
This survey first highlights the generic paradigm of retrieval-augmented generation, and then reviews notable approaches according to different tasks.
arXiv Detail & Related papers (2022-02-02T16:18:41Z) - POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training [93.79766670391618]
We present POINTER, a novel insertion-based approach for hard-constrained text generation.
The proposed method operates by progressively inserting new tokens between existing tokens in a parallel manner.
The resulting coarse-to-fine hierarchy makes the generation process intuitive and interpretable.
arXiv Detail & Related papers (2020-05-01T18:11:54Z)