An Analysis of the Effects of Decoding Algorithms on Fairness in
Open-Ended Language Generation
- URL: http://arxiv.org/abs/2210.03826v1
- Date: Fri, 7 Oct 2022 21:33:34 GMT
- Title: An Analysis of the Effects of Decoding Algorithms on Fairness in
Open-Ended Language Generation
- Authors: Jwala Dhamala, Varun Kumar, Rahul Gupta, Kai-Wei Chang, Aram Galstyan
- Abstract summary: We present a systematic analysis of the impact of decoding algorithms on LM fairness.
We analyze the trade-off between fairness, diversity and quality.
- Score: 77.44921096644698
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Several prior works have shown that language models (LMs) can generate text
containing harmful social biases and stereotypes. While decoding algorithms
play a central role in determining properties of LM generated text, their
impact on the fairness of the generations has not been studied. We present a
systematic analysis of the impact of decoding algorithms on LM fairness, and
analyze the trade-off between fairness, diversity and quality. Our experiments
with top-$p$, top-$k$ and temperature decoding algorithms, in open-ended
language generation, show that fairness across demographic groups changes significantly with the decoding algorithms' hyper-parameters. Notably, decoding algorithms that produce more diverse text also produce more text with
negative sentiment and regard. We present several findings and provide
recommendations on standardized reporting of decoding details in fairness
evaluations and optimization of decoding algorithms for fairness alongside
quality and diversity.
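The trade-off described above can be probed directly by sweeping decoding hyper-parameters and scoring the generations. The sketch below is a minimal illustration, not the paper's setup: it assumes GPT-2 and a generic HuggingFace sentiment pipeline as stand-ins for the paper's models and fairness classifiers, and the two demographic prompt groups are hypothetical placeholders.

```python
# Minimal sketch (not the paper's exact setup): sweep top-p / top-k / temperature
# and compare sentiment and diversity of generations across demographic prompt groups.
# GPT-2 and the default sentiment pipeline are assumed stand-ins.
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
sentiment = pipeline("sentiment-analysis")  # proxy for the paper's sentiment/regard classifiers

# Hypothetical demographic prompt groups; the paper uses established fairness benchmarks.
prompt_groups = {
    "group_a": ["The woman worked as"],
    "group_b": ["The man worked as"],
}

# Decoding configurations under comparison: top-p, top-k, and temperature sampling.
configs = [
    {"do_sample": True, "top_p": 0.5, "top_k": 0},
    {"do_sample": True, "top_p": 0.95, "top_k": 0},
    {"do_sample": True, "top_k": 40},
    {"do_sample": True, "temperature": 1.5, "top_k": 0},
]

def distinct_n(texts, n=2):
    """Fraction of unique n-grams across generations, a simple diversity proxy."""
    ngrams = [tuple(t.split()[i:i + n]) for t in texts
              for i in range(len(t.split()) - n + 1)]
    return len(set(ngrams)) / max(len(ngrams), 1)

for cfg in configs:
    for group, prompts in prompt_groups.items():
        texts = []
        for prompt in prompts:
            ids = tokenizer(prompt, return_tensors="pt").input_ids
            out = model.generate(ids, max_new_tokens=20, num_return_sequences=5,
                                 pad_token_id=tokenizer.eos_token_id, **cfg)
            texts += tokenizer.batch_decode(out[:, ids.shape[1]:], skip_special_tokens=True)
        neg = sum(s["label"] == "NEGATIVE" for s in sentiment(texts)) / len(texts)
        print(cfg, group, f"negative-rate={neg:.2f}", f"distinct-2={distinct_n(texts):.2f}")
```

Sweeping configurations like these is what exposes the fairness, diversity, and quality trade-off the paper studies: settings that raise distinct-n scores can also raise the rate of negatively scored generations for some groups.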
Related papers
- A Theoretical Perspective for Speculative Decoding Algorithm [60.79447486066416]
One effective way to accelerate inference is Speculative Decoding, which employs a small model to sample a sequence of draft tokens and a large model to validate them (see the sketch after this list).
This paper tackles this gap by conceptualizing the decoding problem via a Markov chain abstraction and studying the key properties, output quality and inference acceleration, from a theoretical perspective.
arXiv Detail & Related papers (2024-10-30T01:53:04Z)
- Decoding Decoded: Understanding Hyperparameter Effects in Open-Ended Text Generation [0.22499166814992438]
Decoding strategies for large language models (LLMs) are a critical but often underexplored aspect of text generation tasks.
We present a large-scale, comprehensive analysis of how hyperparameter selection affects text quality in open-ended text generation.
arXiv Detail & Related papers (2024-10-08T14:51:03Z)
- Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Generation [0.20971479389679337]
We introduce adaptive contrastive search, a novel decoding strategy extending contrastive search.
Our findings indicate performance enhancement in both aspects, across different model architectures and datasets.
arXiv Detail & Related papers (2024-07-26T12:23:54Z)
- ToBlend: Token-Level Blending With an Ensemble of LLMs to Attack AI-Generated Text Detection [6.27025292177391]
ToBlend is a novel token-level ensemble text generation method to challenge the robustness of current AI-content detection approaches.
We find ToBlend significantly drops the performance of most mainstream AI-content detection methods.
arXiv Detail & Related papers (2024-02-17T02:25:57Z)
- A Thorough Examination of Decoding Methods in the Era of LLMs [72.65956436513241]
Decoding methods play an indispensable role in converting language models from next-token predictors into practical task solvers.
This paper provides a comprehensive and multifaceted analysis of various decoding methods within the context of large language models.
Our findings reveal that decoding method performance is notably task-dependent and influenced by factors such as alignment, model size, and quantization.
arXiv Detail & Related papers (2024-02-10T11:14:53Z)
- Generating Diverse and High-Quality Texts by Minimum Bayes Risk Decoding [4.209844101827474]
We develop diversity-promoting decoding algorithms, DMBR and KMBR, by incorporating diversity objectives into Minimum Bayes-Risk decoding.
We evaluate DMBR and KMBR on a variety of directed text generation tasks using encoder-decoder models and a large language model with prompting.
arXiv Detail & Related papers (2024-01-10T10:23:41Z)
- Contrastive Decoding Improves Reasoning in Large Language Models [55.16503283583076]
We show that Contrastive Decoding achieves large out-of-the-box improvements over greedy decoding on a variety of reasoning tasks.
We show that Contrastive Decoding leads LLaMA-65B to outperform LLaMA 2, GPT-3.5 and PaLM 2-L on the HellaSwag commonsense reasoning benchmark.
arXiv Detail & Related papers (2023-09-17T00:29:32Z)
- Surfacing Biases in Large Language Models using Contrastive Input Decoding [12.694066526722203]
Contrastive Input Decoding (CID) is a decoding algorithm to generate text given two inputs.
We use CID to highlight context-specific biases that are hard to detect with standard decoding strategies.
arXiv Detail & Related papers (2023-05-12T11:09:49Z)
- Language Model Decoding as Likelihood-Utility Alignment [54.70547032876017]
We introduce a taxonomy that groups decoding strategies based on their implicit assumptions about how well the model's likelihood is aligned with the task-specific notion of utility.
Specifically, by analyzing the correlation between the likelihood and the utility of predictions across a diverse set of tasks, we provide the first empirical evidence supporting the proposed taxonomy.
arXiv Detail & Related papers (2022-10-13T17:55:51Z)
- On Decoding Strategies for Neural Text Generators [73.48162198041884]
We study the interaction between language generation tasks and decoding strategies.
We measure changes in attributes of generated text as a function of both decoding strategy and task.
Our results reveal both previously-observed and surprising findings.
arXiv Detail & Related papers (2022-03-29T16:25:30Z)
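As referenced in the first entry above, speculative decoding accelerates inference by letting a small draft model propose tokens that a large target model then validates. The sketch below is a simplified greedy-acceptance variant, not the rejection-sampling scheme that paper analyzes theoretically, and the gpt2 / gpt2-medium checkpoints are illustrative assumptions only.

```python
# Minimal sketch of the draft-then-verify loop behind speculative decoding
# (greedy-acceptance variant; the referenced paper analyzes rejection sampling).
# gpt2 / gpt2-medium are illustrative stand-ins for the draft and target models.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
draft = AutoModelForCausalLM.from_pretrained("gpt2")          # small, fast proposer
target = AutoModelForCausalLM.from_pretrained("gpt2-medium")  # large verifier

@torch.no_grad()
def speculative_generate(prompt, max_new_tokens=40, k=4):
    ids = tok(prompt, return_tensors="pt").input_ids
    prompt_len = ids.shape[1]
    while ids.shape[1] < prompt_len + max_new_tokens:
        # 1) The draft model proposes up to k tokens greedily.
        draft_out = draft.generate(ids, max_new_tokens=k, do_sample=False,
                                   pad_token_id=tok.eos_token_id)
        proposed = draft_out[0, ids.shape[1]:]
        # 2) The target model scores the prompt plus all proposed tokens in one pass.
        logits = target(draft_out).logits[0]
        # Target's greedy choice at every position a proposed token occupies.
        verify = logits[ids.shape[1] - 1 : draft_out.shape[1] - 1].argmax(-1)
        # 3) Accept the longest prefix on which draft and target agree ...
        n_accept = 0
        while n_accept < proposed.shape[0] and proposed[n_accept] == verify[n_accept]:
            n_accept += 1
        # ... then emit one token from the target (its correction, or a bonus token).
        if n_accept < proposed.shape[0]:
            next_tok = verify[n_accept : n_accept + 1]
        else:
            next_tok = logits[-1].argmax(-1, keepdim=True)
        ids = torch.cat([ids[0], proposed[:n_accept], next_tok]).unsqueeze(0)
    return tok.decode(ids[0, prompt_len:], skip_special_tokens=True)

print(speculative_generate("Decoding algorithms shape the text that language models"))
```

Because the target model checks a whole block of drafted tokens in a single forward pass, each accepted token costs roughly one draft-model step rather than one target-model step, which is where the speed-up comes from.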
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the accuracy of the information presented and is not responsible for any consequences of its use.