A Comprehensive Survey of Natural Language Generation Advances from the
Perspective of Digital Deception
- URL: http://arxiv.org/abs/2208.05757v1
- Date: Thu, 11 Aug 2022 11:27:38 GMT
- Authors: Keenan Jones, Enes Altuncu, Virginia N. L. Franqueira, Yichao Wang and
Shujun Li
- Abstract summary: We provide an overview of the field of natural language generators (NLG).
We outline a proposed high-level taxonomy of the central concepts that constitute NLG.
We discuss the broader challenges of NLG, including the risks of bias that are often exhibited by existing text generation systems.
- Score: 1.557442325082254
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In recent years there has been substantial growth in the capabilities of
systems designed to generate text that mimics the fluency and coherence of
human language. From this, there has been considerable research aimed at
examining the potential uses of these natural language generators (NLG) for
a wide range of tasks. The increasing capabilities of powerful text generators
to mimic human writing convincingly raise the potential for deception and
other forms of dangerous misuse. As these systems improve, and it becomes ever
harder to distinguish between human-written and machine-generated text,
malicious actors could leverage these powerful NLG systems to a wide variety of
ends, including the creation of fake news and misinformation, the generation of
fake online product reviews, or the use of chatbots as a means of convincing users to
divulge private information. In this paper, we provide an overview of the NLG
field via the identification and examination of 119 survey-like papers focused
on NLG research. From these identified papers, we outline a proposed high-level
taxonomy of the central concepts that constitute NLG, including the methods
used to develop generalised NLG systems, the means by which these systems are
evaluated, and the popular NLG tasks and subtasks that exist. In turn, we
provide an overview and discussion of each of these items with respect to
current research and offer an examination of the potential roles of NLG in
deception and detection systems to counteract these threats. Moreover, we
discuss the broader challenges of NLG, including the risks of bias that are
often exhibited by existing text generation systems. This work offers a broad
overview of the field of NLG with respect to its potential for misuse, aiming
to provide a high-level understanding of this rapidly developing area of
research.
Related papers
- A Survey of AI-generated Text Forensic Systems: Detection, Attribution,
and Characterization [13.44566185792894]
AI-generated text forensics is an emerging field addressing the challenges of LLM misuse.
We introduce a detailed taxonomy, focusing on three primary pillars: detection, attribution, and characterization.
We explore available resources for AI-generated text forensics research and discuss the evolving challenges and future directions of forensic systems in an AI era.
arXiv Detail & Related papers (2024-03-02T09:39:13Z)
- Leveraging Large Language Models for NLG Evaluation: Advances and Challenges [57.88520765782177]
Large Language Models (LLMs) have opened new avenues for assessing generated content quality, e.g., coherence, creativity, and context relevance.
We propose a coherent taxonomy for organizing existing LLM-based evaluation metrics, offering a structured framework to understand and compare these methods.
By discussing unresolved challenges, including bias, robustness, domain-specificity, and unified evaluation, this paper seeks to offer insights to researchers and advocate for fairer and more advanced NLG evaluation techniques.
arXiv Detail & Related papers (2024-01-13T15:59:09Z)
- Towards Possibilities & Impossibilities of AI-generated Text Detection:
A Survey [97.33926242130732]
Large Language Models (LLMs) have revolutionized the domain of natural language processing (NLP) with remarkable capabilities of generating human-like text responses.
Despite these advancements, several works in the existing literature have raised serious concerns about the potential misuse of LLMs.
To address these concerns, a consensus among the research community is to develop algorithmic solutions to detect AI-generated text.
arXiv Detail & Related papers (2023-10-23T18:11:32Z)
- Large Language Models for Information Retrieval: A Survey [58.30439850203101]
Information retrieval has evolved from term-based methods to its integration with advanced neural models.
Recent research has sought to leverage large language models (LLMs) to improve IR systems.
We delve into the confluence of LLMs and IR systems, including crucial aspects such as query rewriters, retrievers, rerankers, and readers.
arXiv Detail & Related papers (2023-08-14T12:47:22Z)
- Machine Generated Text: A Comprehensive Survey of Threat Models and
Detection Methods [6.978441815839558]
This survey places machine generated text within its cybersecurity and social context.
It includes an analysis of threat models posed by contemporary NLG systems.
It provides guidance for future work addressing the most critical threat models.
arXiv Detail & Related papers (2022-10-13T19:46:14Z)
- Innovations in Neural Data-to-text Generation: A Survey [10.225452376884233]
This survey offers a consolidated view into the neural DTG paradigm with a structured examination of the approaches, benchmark datasets, and evaluation protocols.
We highlight promising avenues for DTG research that not only focus on the design of linguistically capable systems but also systems that exhibit fairness and accountability.
arXiv Detail & Related papers (2022-07-25T23:21:48Z)
- Recent Advances in Neural Text Generation: A Task-Agnostic Survey [20.932460734129585]
This paper offers a comprehensive and task-agnostic survey of the recent advancements in neural text generation.
We categorize these advancements into four key areas: data construction, neural frameworks, training and inference strategies, and evaluation metrics.
We explore the future directions for the advancement of neural text generation, which encompass the utilization of neural pipelines and the incorporation of background knowledge.
arXiv Detail & Related papers (2022-03-06T20:47:49Z)
- Survey of Hallucination in Natural Language Generation [69.9926849848132]
Natural Language Generation (NLG) has improved exponentially in recent years thanks to the development of sequence-to-sequence deep learning technologies.
Deep learning based generation is prone to hallucinate unintended text, which degrades the system performance.
This survey serves to facilitate collaborative efforts among researchers in tackling the challenge of hallucinated texts in NLG.
arXiv Detail & Related papers (2022-02-08T03:55:01Z)
- A Survey of Natural Language Generation [30.134226859027642]
This paper offers a comprehensive review of the research on Natural Language Generation (NLG) over the past two decades.
It focuses on data-to-text generation and text-to-text generation deep learning methods, as well as new applications of NLG technology.
arXiv Detail & Related papers (2021-12-22T09:08:00Z)
- A Survey of Knowledge-Enhanced Text Generation [81.24633231919137]
The goal of text generation is to make machines express themselves in human language.
Various neural encoder-decoder models have been proposed to achieve the goal by learning to map input text to output text.
Because the input text alone often provides limited knowledge, researchers have considered incorporating various forms of knowledge beyond the input text into the generation models.
arXiv Detail & Related papers (2020-10-09T06:46:46Z)
- Evaluation of Text Generation: A Survey [107.62760642328455]
The paper surveys evaluation methods of natural language generation systems that have been developed in the last few years.
We group NLG evaluation methods into three categories: (1) human-centric evaluation metrics, (2) automatic metrics that require no training, and (3) machine-learned metrics.
arXiv Detail & Related papers (2020-06-26T04:52:48Z)