Text Generation: A Systematic Literature Review of Tasks, Evaluation, and Challenges
- URL: http://arxiv.org/abs/2405.15604v3
- Date: Thu, 29 Aug 2024 20:05:27 GMT
- Title: Text Generation: A Systematic Literature Review of Tasks, Evaluation, and Challenges
- Authors: Jonas Becker, Jan Philip Wahle, Bela Gipp, Terry Ruas
- Abstract summary: This review categorizes works in text generation into five main tasks.
For each task, we review their relevant characteristics, sub-tasks, and specific challenges.
Our investigation shows nine prominent challenges common to all tasks and sub-tasks in recent text generation publications.
- Score: 7.140449861888235
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Text generation has become more accessible than ever, and the increasing interest in these systems, especially those using large language models, has spurred an increasing number of related publications. We provide a systematic literature review comprising 244 selected papers between 2017 and 2024. This review categorizes works in text generation into five main tasks: open-ended text generation, summarization, translation, paraphrasing, and question answering. For each task, we review their relevant characteristics, sub-tasks, and specific challenges (e.g., missing datasets for multi-document summarization, coherence in story generation, and complex reasoning for question answering). Additionally, we assess current approaches for evaluating text generation systems and ascertain problems with current metrics. Our investigation shows nine prominent challenges common to all tasks and sub-tasks in recent text generation publications: bias, reasoning, hallucinations, misuse, privacy, interpretability, transparency, datasets, and computing. We provide a detailed analysis of these challenges, their potential solutions, and which gaps still require further engagement from the community. This systematic literature review targets two main audiences: early career researchers in natural language processing looking for an overview of the field and promising research directions, as well as experienced researchers seeking a detailed view of tasks, evaluation methodologies, open challenges, and recent mitigation strategies.
Related papers
- Abstractive Text Summarization: State of the Art, Challenges, and Improvements [6.349503549199403]
This review takes a comprehensive approach, covering state-of-the-art methods, challenges, solutions, comparisons, and limitations, and charts future improvements.
The paper highlights challenges such as inadequate meaning representation, factual consistency, controllable text summarization, cross-lingual summarization, and evaluation metrics.
arXiv Detail & Related papers (2024-09-04T03:39:23Z) - What Makes a Good Story and How Can We Measure It? A Comprehensive Survey of Story Evaluation [57.550045763103334]
Evaluating a story can be more challenging than other generation evaluation tasks.
We first summarize existing storytelling tasks, including text-to-text, visual-to-text, and text-to-visual.
We propose a taxonomy to organize evaluation metrics that have been developed or can be adopted for story evaluation.
arXiv Detail & Related papers (2024-08-26T20:35:42Z) - CADS: A Systematic Literature Review on the Challenges of Abstractive Dialogue Summarization [7.234196390284036]
This article summarizes the research on Transformer-based abstractive summarization for English dialogues.
We cover the main challenges present in dialogue summarization (i.e., language, structure, comprehension, speaker, salience, and factuality).
We find that while some challenges, like language, have seen considerable progress, others, such as comprehension, factuality, and salience, remain difficult and hold significant research opportunities.
arXiv Detail & Related papers (2024-06-11T17:30:22Z) - Large Language Models (LLMs) on Tabular Data: Prediction, Generation, and Understanding -- A Survey [17.19337964440007]
There is currently no comprehensive review that summarizes and compares the key techniques, metrics, datasets, models, and optimization approaches in this research domain.
This survey aims to address this gap by consolidating recent progress in these areas, offering a thorough survey and taxonomy of the datasets, metrics, and methodologies utilized.
It identifies strengths, limitations, unexplored territories, and gaps in the existing literature, while providing some insights for future research directions in this vital and rapidly evolving field.
arXiv Detail & Related papers (2024-02-27T23:59:01Z) - A Systematic Review of Data-to-Text NLG [2.4769539696439677]
Methods for producing high-quality text are explored, addressing the challenge of hallucinations in data-to-text generation.
Despite advancements in text quality, the review emphasizes the need for research on low-resource languages.
arXiv Detail & Related papers (2024-02-13T14:51:45Z) - Summarization with Graphical Elements [55.5913491389047]
We propose a new task: summarization with graphical elements.
We collect a high-quality, human-labeled dataset to support research into the task.
arXiv Detail & Related papers (2022-04-15T17:16:41Z) - Faithfulness in Natural Language Generation: A Systematic Survey of Analysis, Evaluation and Optimization Methods [48.47413103662829]
Natural Language Generation (NLG) has made great progress in recent years due to the development of deep learning techniques such as pre-trained language models.
However, the faithfulness problem, namely that generated text often contains unfaithful or non-factual information, has become the biggest challenge.
arXiv Detail & Related papers (2022-03-10T08:28:32Z) - A Survey on Retrieval-Augmented Text Generation [53.04991859796971]
Retrieval-augmented text generation has remarkable advantages and has achieved state-of-the-art performance in many NLP tasks.
It first highlights the generic paradigm of retrieval-augmented generation and then reviews notable approaches for different tasks.
arXiv Detail & Related papers (2022-02-02T16:18:41Z) - Positioning yourself in the maze of Neural Text Generation: A Task-Agnostic Survey [54.34370423151014]
This paper surveys the components of modeling approaches, tracing how task requirements shape them across generation tasks such as storytelling, summarization, and translation.
We present an abstraction of the essential techniques with respect to learning paradigms, pretraining, modeling approaches, and decoding, along with the key open challenges in each.
arXiv Detail & Related papers (2020-10-14T17:54:42Z) - From Standard Summarization to New Tasks and Beyond: Summarization with Manifold Information [77.89755281215079]
Text summarization is the research area that aims to create a short, condensed version of an original document.
In real-world applications, most data is not in plain-text format.
This paper surveys these new summarization tasks and approaches in real-world applications.
arXiv Detail & Related papers (2020-05-10T14:59:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.