Related papers: Selecting Shots for Demographic Fairness in Few-Shot Learning with Large Language Models

Selecting Shots for Demographic Fairness in Few-Shot Learning with Large Language Models

URL: http://arxiv.org/abs/2311.08472v1
Date: Tue, 14 Nov 2023 19:02:03 GMT
Title: Selecting Shots for Demographic Fairness in Few-Shot Learning with Large Language Models
Authors: Carlos Aguirre, Kuleen Sasse, Isabel Cachola and Mark Dredze
Abstract summary: We explore the effect of shots, which directly affect the performance of models, on the fairness of large language models (LLMs) as NLP classification systems. We consider how different shot selection strategies, both existing and new demographically sensitive methods, affect model fairness across three standard fairness datasets.
Score: 14.772568847965408
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recently, work in NLP has shifted to few-shot (in-context) learning, with large language models (LLMs) performing well across a range of tasks. However, while fairness evaluations have become a standard for supervised methods, little is known about the fairness of LLMs as prediction systems. Further, common standard methods for fairness involve access to models weights or are applied during finetuning, which are not applicable in few-shot learning. Do LLMs exhibit prediction biases when used for standard NLP tasks? In this work, we explore the effect of shots, which directly affect the performance of models, on the fairness of LLMs as NLP classification systems. We consider how different shot selection strategies, both existing and new demographically sensitive methods, affect model fairness across three standard fairness datasets. We discuss how future work can include LLM fairness evaluations.

Related papers

Group Fairness Meets the Black Box: Enabling Fair Algorithms on Closed LLMs via Post-Processing [14.622788745587815]
We propose a framework for deriving fair classifiers from closed-weight LLMs via prompting.<n>Our framework is data-efficient and outperforms fair classifiers trained on LLM embeddings.
arXiv Detail & Related papers (2025-08-15T06:50:29Z)
Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge [84.34545223897578]
Despite their excellence in many domains, potential issues are under-explored, undermining their reliability and the scope of their utility. We identify 12 key potential biases and propose a new automated bias quantification framework-CALM- which quantifies and analyzes each type of bias in LLM-as-a-Judge. Our work highlights the need for stakeholders to address these issues and remind users to exercise caution in LLM-as-a-Judge applications.
arXiv Detail & Related papers (2024-10-03T17:53:30Z)
A Multi-LLM Debiasing Framework [85.17156744155915]
Large Language Models (LLMs) are powerful tools with the potential to benefit society immensely, yet, they have demonstrated biases that perpetuate societal inequalities. Recent research has shown a growing interest in multi-LLM approaches, which have been demonstrated to be effective in improving the quality of reasoning. We propose a novel multi-LLM debiasing framework aimed at reducing bias in LLMs.
arXiv Detail & Related papers (2024-09-20T20:24:50Z)
Fairness in Large Language Models in Three Hours [2.443957114877221]
This tutorial provides a systematic overview of recent advances in the literature concerning large language models. The concept of fairness in LLMs is then explored, summarizing the strategies for evaluating bias and the algorithms designed to promote fairness.
arXiv Detail & Related papers (2024-08-02T03:44:14Z)
Do Large Language Models Rank Fairly? An Empirical Study on the Fairness of LLMs as Rankers [27.66626125248612]
This paper presents an empirical study evaluating Large Language Models (LLMs) using the TREC Fair Ranking dataset. We focus on the representation of binary protected attributes such as gender and geographic location, which are historically underrepresented in search outcomes. Our analysis delves into how these LLMs handle queries and documents related to these attributes, aiming to uncover biases in their ranking algorithms.
arXiv Detail & Related papers (2024-04-04T04:23:19Z)
Fairness in Large Language Models: A Taxonomic Survey [2.669847575321326]
Large Language Models (LLMs) have demonstrated remarkable success across various domains. Despite their promising performance in numerous real-world applications, most of these algorithms lack fairness considerations.
arXiv Detail & Related papers (2024-03-31T22:22:53Z)
Few-Shot Fairness: Unveiling LLM's Potential for Fairness-Aware Classification [7.696798306913988]
We introduce a framework outlining fairness regulations aligned with various fairness definitions. We explore the configuration for in-context learning and the procedure for selecting in-context demonstrations using RAG. Experiments conducted with different LLMs indicate that GPT-4 delivers superior results in terms of both accuracy and fairness compared to other models.
arXiv Detail & Related papers (2024-02-28T17:29:27Z)
Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering. The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored. We propose a framework that enhances the reliability of LLMs as it: 1) generalizes out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z)
Fair Few-shot Learning with Auxiliary Sets [53.30014767684218]
In many machine learning (ML) tasks, only very few labeled data samples can be collected, which can lead to inferior fairness performance. In this paper, we define the fairness-aware learning task with limited training samples as the emphfair few-shot learning problem. We devise a novel framework that accumulates fairness-aware knowledge across different meta-training tasks and then generalizes the learned knowledge to meta-test tasks.
arXiv Detail & Related papers (2023-08-28T06:31:37Z)
A Survey on Fairness in Large Language Models [28.05516809190299]
Large Language Models (LLMs) have shown powerful performance and development prospects. LLMs can capture social biases from unprocessed training data and propagate the biases to downstream tasks. Unfair LLM systems have undesirable social impacts and potential harms.
arXiv Detail & Related papers (2023-08-20T03:30:22Z)
Non-Invasive Fairness in Learning through the Lens of Data Drift [88.37640805363317]
We show how to improve the fairness of Machine Learning models without altering the data or the learning algorithm. We use a simple but key insight: the divergence of trends between different populations, and, consecutively, between a learned model and minority populations, is analogous to data drift. We explore two strategies (model-splitting and reweighing) to resolve this drift, aiming to improve the overall conformance of models to the underlying data.
arXiv Detail & Related papers (2023-03-30T17:30:42Z)
Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning [104.58874584354787]
In recent years, pre-trained large language models (LLMs) have demonstrated remarkable efficiency in achieving an inference-time few-shot learning capability known as in-context learning. This study aims to examine the in-context learning phenomenon through a Bayesian lens, viewing real-world LLMs as latent variable models.
arXiv Detail & Related papers (2023-01-27T18:59:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.