Related papers: Societal Biases in Language Generation: Progress and Challenges

Societal Biases in Language Generation: Progress and Challenges

URL: http://arxiv.org/abs/2105.04054v1
Date: Mon, 10 May 2021 00:17:33 GMT
Title: Societal Biases in Language Generation: Progress and Challenges
Authors: Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, and Nanyun Peng
Abstract summary: Language generation presents unique challenges in terms of direct user interaction and the structure of decoding techniques. We present a survey on societal biases in language generation, focusing on how techniques contribute to biases and on progress towards bias analysis and mitigation. Motivated by a lack of studies on biases from decoding techniques, we also conduct experiments to quantify the effects of these techniques.
Score: 43.06301135908934
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Technology for language generation has advanced rapidly, spurred by advancements in pre-training large models on massive amounts of data and the need for intelligent agents to communicate in a natural manner. While techniques can effectively generate fluent text, they can also produce undesirable societal biases that can have a disproportionately negative impact on marginalized populations. Language generation presents unique challenges in terms of direct user interaction and the structure of decoding techniques. To better understand these challenges, we present a survey on societal biases in language generation, focusing on how techniques contribute to biases and on progress towards bias analysis and mitigation. Motivated by a lack of studies on biases from decoding techniques, we also conduct experiments to quantify the effects of these techniques. By further discussing general trends and open challenges, we call to attention promising directions for research and the importance of fairness and inclusivity considerations for language generation applications.

Related papers

Misspellings in Natural Language Processing: A survey [52.419589623702336]
misspellings have become ubiquitous in digital communication. We reconstruct a history of misspellings as a scientific problem. We discuss the latest advancements to address the challenge of misspellings in NLP.
arXiv Detail & Related papers (2025-01-28T10:26:04Z)
Generative AI and Large Language Models in Language Preservation: Opportunities and Challenges [0.0]
Generative AI and large-scale language models (LLM) have emerged as powerful tools in language preservation. This paper examines the role of generative AIs and LLMs in preserving endangered languages, highlighting the risks and challenges associated with their use.
arXiv Detail & Related papers (2025-01-20T14:03:40Z)
Developmental Predictive Coding Model for Early Infancy Mono and Bilingual Vocal Continual Learning [69.8008228833895]
We propose a small-sized generative neural network equipped with a continual learning mechanism. Our model prioritizes interpretability and demonstrates the advantages of online learning.
arXiv Detail & Related papers (2024-12-23T10:23:47Z)
Detection of Machine-Generated Text: Literature Survey [0.0]
This literature survey aims to compile and synthesize accomplishments and developments in the field of machine-generated text. It also gives an overview of machine-generated text trends and explores the larger societal implications.
arXiv Detail & Related papers (2024-01-02T01:44:15Z)
Developing Linguistic Patterns to Mitigate Inherent Human Bias in Offensive Language Detection [1.6574413179773761]
We propose a linguistic data augmentation approach to reduce bias in labeling processes. This approach has the potential to improve offensive language classification tasks across multiple languages.
arXiv Detail & Related papers (2023-12-04T10:20:36Z)
Factuality Challenges in the Era of Large Language Models [113.3282633305118]
Large Language Models (LLMs) generate false, erroneous, or misleading content. LLMs can be exploited for malicious applications. This poses a significant challenge to society in terms of the potential deception of users.
arXiv Detail & Related papers (2023-10-08T14:55:02Z)
Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs) We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing. We then unify the literature by proposing three intuitive, two for bias evaluation, and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)
On the application of Large Language Models for language teaching and assessment technology [18.735612275207853]
We look at the potential for incorporating large language models in AI-driven language teaching and assessment systems. We find that larger language models offer improvements over previous models in text generation. For automated grading and grammatical error correction, tasks whose progress is checked on well-known benchmarks, early investigations indicate that large language models on their own do not improve on state-of-the-art results.
arXiv Detail & Related papers (2023-07-17T11:12:56Z)
Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models [11.323961700172175]
This article investigates the challenges and risks associated with biases in large-scale language models like ChatGPT. We discuss the origins of biases, stemming from, among others, the nature of training data, model specifications, algorithmic constraints, product design, and policy decisions. We review the current approaches to identify, quantify, and mitigate biases in language models, emphasizing the need for a multi-disciplinary, collaborative effort to develop more equitable, transparent, and responsible AI systems.
arXiv Detail & Related papers (2023-04-07T17:14:00Z)
Why is constrained neural language generation particularly challenging? [13.62873478165553]
We present an extensive survey on the emerging topic of constrained neural language generation. We distinguish between conditions and constraints, present constrained text generation tasks, and review existing methods and evaluation metrics for constrained text generation. Our aim is to highlight recent progress and trends in this emerging field, informing on the most promising directions and limitations towards advancing the state-of-the-art of constrained neural language generation research.
arXiv Detail & Related papers (2022-06-11T02:07:33Z)
Causal Reasoning Meets Visual Representation Learning: A Prospective Study [117.08431221482638]
Lack of interpretability, robustness, and out-of-distribution generalization are becoming the challenges of the existing visual models. Inspired by the strong inference ability of human-level agents, recent years have witnessed great effort in developing causal reasoning paradigms. This paper aims to provide a comprehensive overview of this emerging field, attract attention, encourage discussions, bring to the forefront the urgency of developing novel causal reasoning methods.
arXiv Detail & Related papers (2022-04-26T02:22:28Z)
Survey of Hallucination in Natural Language Generation [69.9926849848132]
Natural Language Generation (NLG) has improved exponentially in recent years thanks to the development of sequence-to-sequence deep learning technologies. Deep learning based generation is prone to hallucinate unintended text, which degrades the system performance. This survey serves to facilitate collaborative efforts among researchers in tackling the challenge of hallucinated texts in NLG.
arXiv Detail & Related papers (2022-02-08T03:55:01Z)
Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) can be potentially dangerous in manifesting undesirable representational biases. We propose steps towards mitigating social biases during text generation. Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.