Improving LLM Group Fairness on Tabular Data via In-Context Learning
- URL: http://arxiv.org/abs/2412.04642v1
- Date: Thu, 05 Dec 2024 22:23:30 GMT
- Title: Improving LLM Group Fairness on Tabular Data via In-Context Learning
- Authors: Valeriia Cherepanova, Chia-Jung Lee, Nil-Jana Akpinar, Riccardo Fogliato, Martin Andres Bertran, Michael Kearns, James Zou
- Abstract summary: Large language models (LLMs) can fail to generate predictions that satisfy group fairness, that is, produce equitable outcomes across groups. In this work, we investigate four empirical approaches to improve group fairness. We show the effectiveness of these methods in enhancing demographic parity while maintaining high overall performance.
- Score: 23.53624663038328
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have been shown to be effective on tabular prediction tasks in the low-data regime, leveraging their internal knowledge and ability to learn from instructions and examples. However, LLMs can fail to generate predictions that satisfy group fairness, that is, produce equitable outcomes across groups. Critically, conventional debiasing approaches for natural language tasks do not directly translate to mitigating group unfairness in tabular settings. In this work, we systematically investigate four empirical approaches to improve group fairness of LLM predictions on tabular datasets, including fair prompt optimization, soft prompt tuning, strategic selection of few-shot examples, and self-refining predictions via chain-of-thought reasoning. Through experiments on four tabular datasets using both open-source and proprietary LLMs, we show the effectiveness of these methods in enhancing demographic parity while maintaining high overall performance. Our analysis provides actionable insights for practitioners in selecting the most suitable approach based on their specific requirements and constraints.
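To make these notions concrete, below is a minimal Python sketch, not the paper's exact implementation, of two of the ingredients the abstract names: measuring the demographic parity gap of predictions over tabular rows, and strategically selecting a group- and label-balanced set of few-shot examples. The column names ("sex", "income") and the row layout are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's exact method):
# (1) demographic parity gap of predictions over tabular rows,
# (2) a group/label-balanced few-shot example selector.
# Column names "sex"/"income" and the row format are assumptions.
import random
from collections import defaultdict

def demographic_parity_gap(rows, preds, group_key="sex", positive=">50K"):
    """Largest difference in positive-prediction rate between any two groups."""
    counts = defaultdict(lambda: [0, 0])  # group -> [num_positive, num_total]
    for row, pred in zip(rows, preds):
        stats = counts[row[group_key]]
        stats[0] += int(pred == positive)
        stats[1] += 1
    rates = [pos / total for pos, total in counts.values()]
    return max(rates) - min(rates)

def balanced_few_shot(pool, k=8, group_key="sex", label_key="income", seed=0):
    """Choose k in-context examples spread evenly over (group, label) cells."""
    rng = random.Random(seed)
    cells = defaultdict(list)
    for row in pool:
        cells[(row[group_key], row[label_key])].append(row)
    per_cell = max(1, k // len(cells))
    chosen = []
    for members in cells.values():
        chosen.extend(rng.sample(members, min(per_cell, len(members))))
    rng.shuffle(chosen)
    return chosen[:k]
```

The gap computed above is the quantity the four approaches aim to shrink while preserving accuracy; the balanced sampler is one simple instantiation of what "strategic selection of few-shot examples" could look like.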
Related papers
- Information Gain-Guided Causal Intervention for Autonomous Debiasing Large Language Models [40.853803921563596]
Current large language models (LLMs) may still capture dataset biases and utilize them during inference.
We propose an information gain-guided causal intervention debiasing framework.
The resulting IGCIDB framework can effectively debias LLMs and improve their generalizability across different tasks.
arXiv Detail & Related papers (2025-04-17T12:39:25Z)
- Training Large Recommendation Models via Graph-Language Token Alignment [53.3142545812349]
We propose GLTA, a novel framework to train large recommendation models via Graph-Language Token Alignment.
By aligning item and user nodes from the interaction graph with pretrained LLM tokens, GLTA effectively leverages the reasoning abilities of LLMs.
Furthermore, we introduce Graph-Language Logits Matching (GLLM) to optimize token alignment for end-to-end item prediction.
arXiv Detail & Related papers (2025-02-26T02:19:10Z)
- Latent Factor Models Meets Instructions: Goal-conditioned Latent Factor Discovery without Task Supervision [50.45597801390757]
Instruct-LF is a goal-oriented latent factor discovery system.
It integrates instruction-following ability with statistical models to handle noisy datasets.
arXiv Detail & Related papers (2025-02-21T02:03:08Z)
- Teaching LLMs According to Their Aptitude: Adaptive Reasoning for Mathematical Problem Solving [55.895917967408586]
Existing approaches to mathematical reasoning with large language models rely on Chain-of-Thought (CoT) for generalizability or Tool-Integrated Reasoning (TIR) for precise computation.
We propose TATA (Teaching LLMs According to Their Aptitude), an adaptive framework that enables LLMs to spontaneously personalize their reasoning strategy.
arXiv Detail & Related papers (2025-02-17T16:56:23Z)
- Fair In-Context Learning via Latent Concept Variables [17.216196320585922]
Large language models (LLMs) can inherit social bias and discrimination from their pre-training data.
We design data augmentation strategies that reduce the correlation between predictive outcomes and sensitive variables, helping to promote fairness during latent concept learning.
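For intuition, here is a generic sketch of one such decorrelation strategy: oversampling underrepresented (group, label) cells so the empirical label distribution is identical across sensitive groups. The key names and the duplication-based scheme are assumptions for illustration, not necessarily the paper's exact augmentation.

```python
# Generic decorrelation sketch (an illustrative assumption, not necessarily
# the paper's exact augmentation): oversample (group, label) cells to equal
# size so label frequencies match across sensitive groups.
import random
from collections import defaultdict

def decorrelate_by_oversampling(rows, group_key="group", label_key="label", seed=0):
    rng = random.Random(seed)
    cells = defaultdict(list)
    for row in rows:
        cells[(row[group_key], row[label_key])].append(row)
    target = max(len(members) for members in cells.values())
    augmented = []
    for members in cells.values():
        # Duplicate randomly chosen rows until the cell reaches the target size.
        extras = [rng.choice(members) for _ in range(target - len(members))]
        augmented.extend(members + extras)
    rng.shuffle(augmented)
    return augmented
```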
arXiv Detail & Related papers (2024-11-04T23:10:05Z)
- Context is Key: A Benchmark for Forecasting with Essential Textual Information [87.3175915185287]
"Context is Key" (CiK) is a time series forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context.
We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters.
Our experiments highlight the importance of incorporating contextual information, demonstrate surprising performance when using LLM-based forecasting models, and also reveal some of their critical shortcomings.
arXiv Detail & Related papers (2024-10-24T17:56:08Z)
- In-Context Learning with Reinforcement Learning for Incomplete Utterance Rewriting [33.89176174108559]
In-context learning of large language models (LLMs) makes predictions based only on instructions augmented with a few examples.
Existing example selection methods for ICL use sparse or dense retrievers and achieve strong performance.
We propose a policy-based reinforcement learning framework for example selection (RLS), which consists of a language model (LM) selector and an LLM generator.
arXiv Detail & Related papers (2024-08-23T12:32:12Z)
- Causal-Guided Active Learning for Debiasing Large Language Models [40.853803921563596]
Current generative large language models (LLMs) may still capture dataset biases and utilize them for generation.
Previous debiasing methods, whether based on prior knowledge or on fine-tuning, may not be suitable for current LLMs.
We propose a causal-guided active learning framework, which uses the LLM itself to automatically and autonomously identify informative biased samples and induce bias patterns.
arXiv Detail & Related papers (2024-08-23T09:46:15Z)
- LLM-Select: Feature Selection with Large Language Models [64.5099482021597]
Large language models (LLMs) are capable of selecting the most predictive features, with performance rivaling the standard tools of data science.
Our findings suggest that LLMs may be useful not only for selecting the best features for training but also for deciding which features to collect in the first place.
arXiv Detail & Related papers (2024-07-02T22:23:40Z)
- Steering LLMs Towards Unbiased Responses: A Causality-Guided Debiasing Framework [20.753141804841]
Large language models (LLMs) can easily generate biased and discriminative responses.
This paper focuses on social bias, tackling the association between demographic information and LLM outputs.
arXiv Detail & Related papers (2024-03-13T17:46:28Z)
- ChatGPT Based Data Augmentation for Improved Parameter-Efficient Debiasing of LLMs [65.9625653425636]
Large language models (LLMs) exhibit harmful social biases.
This work introduces a novel approach utilizing ChatGPT to generate synthetic training data.
arXiv Detail & Related papers (2024-02-19T01:28:48Z)
- Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models [52.98743860365194]
We propose a new fine-tuning method called Self-Play fIne-tuNing (SPIN).
At the heart of SPIN lies a self-play mechanism, where the LLM refines its capability by playing against instances of itself.
This sheds light on the promise of self-play, enabling the achievement of human-level performance in LLMs without the need for expert opponents.
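As a toy illustration of the self-play objective as it is commonly formulated (a DPO-style logistic loss that rewards assigning more probability to human-written responses than to the model's own previous-iteration outputs), here is a small numeric sketch; the log-probabilities are made-up stand-ins, not outputs of a real model.

```python
# Toy sketch of a SPIN-style pairwise objective (common formulation; the
# log-probabilities below are made-up stand-ins, not real model outputs).
import math

def spin_pair_loss(lp_human_new, lp_human_old, lp_self_new, lp_self_old, lam=1.0):
    """Logistic loss on the margin between the human response and the model's
    own previous-iteration response, each scored as a log-prob ratio of the
    updated model against the frozen previous iteration."""
    margin = lam * ((lp_human_new - lp_human_old) - (lp_self_new - lp_self_old))
    return math.log1p(math.exp(-margin))

# Updated model prefers the human response over its old self-generated one:
print(spin_pair_loss(-10.0, -12.0, -14.0, -11.0))  # margin +5 -> loss ~0.007
# Updated model regressed toward its own old outputs:
print(spin_pair_loss(-12.0, -10.0, -11.0, -14.0))  # margin -5 -> loss ~5.007
```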
arXiv Detail & Related papers (2024-01-02T18:53:13Z)
- On Learning to Summarize with Large Language Models as References [101.79795027550959]
Summaries generated by large language models (LLMs) are favored by human annotators over the original reference summaries in commonly used summarization datasets.
We study an LLM-as-reference learning setting for smaller text summarization models to investigate whether their performance can be substantially improved.
arXiv Detail & Related papers (2023-05-23T16:56:04Z)
- Robust Prompt Optimization for Large Language Models Against Distribution Shifts [80.6757997074956]
Large language models (LLMs) have demonstrated significant ability in various natural language processing tasks.
We propose a new problem of robust prompt optimization for LLMs against distribution shifts.
This problem requires that a prompt optimized on the labeled source group also generalize to an unlabeled target group.
arXiv Detail & Related papers (2023-05-23T11:30:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.