Related papers: Distributive Fairness in Large Language Models: Evaluating Alignment with Human Values

Distributive Fairness in Large Language Models: Evaluating Alignment with Human Values

URL: http://arxiv.org/abs/2502.00313v1
Date: Sat, 01 Feb 2025 04:24:47 GMT
Title: Distributive Fairness in Large Language Models: Evaluating Alignment with Human Values
Authors: Hadi Hosseini, Samarth Khanna,
Abstract summary: A number of societal problems involve the distribution of resources, where fairness, along with economic efficiency, play a critical role in the desirability of outcomes.<n>This paper examines whether large language models (LLMs) adhere to fundamental fairness concepts and investigate their alignment with human preferences.
Score: 13.798198972161657
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The growing interest in employing large language models (LLMs) for decision-making in social and economic contexts has raised questions about their potential to function as agents in these domains. A significant number of societal problems involve the distribution of resources, where fairness, along with economic efficiency, play a critical role in the desirability of outcomes. In this paper, we examine whether LLM responses adhere to fundamental fairness concepts such as equitability, envy-freeness, and Rawlsian maximin, and investigate their alignment with human preferences. We evaluate the performance of several LLMs, providing a comparative benchmark of their ability to reflect these measures. Our results demonstrate a lack of alignment between current LLM responses and human distributional preferences. Moreover, LLMs are unable to utilize money as a transferable resource to mitigate inequality. Nonetheless, we demonstrate a stark contrast when (some) LLMs are tasked with selecting from a predefined menu of options rather than generating one. In addition, we analyze the robustness of LLM responses to variations in semantic factors (e.g. intentions or personas) or non-semantic prompting changes (e.g. templates or orderings). Finally, we highlight potential strategies aimed at enhancing the alignment of LLM behavior with well-established fairness concepts.

Related papers

Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games [87.5673042805229]
How large language models balance self-interest and collective well-being is a critical challenge for ensuring alignment, robustness, and safe deployment.<n>We adapt a public goods game with institutional choice from behavioral economics, allowing us to observe how different LLMs navigate social dilemmas.<n>Surprisingly, we find that reasoning LLMs, such as the o1 series, struggle significantly with cooperation.
arXiv Detail & Related papers (2025-06-29T15:02:47Z)
Alignment Revisited: Are Large Language Models Consistent in Stated and Revealed Preferences? [5.542420010310746]
A critical, yet understudied, issue is the potential divergence between an LLM's stated preferences and its revealed preferences.<n>This work formally defines and proposes a method to measure this preference deviation.<n>Our study will be crucial for integrating LLMs into services, especially those that interact directly with humans.
arXiv Detail & Related papers (2025-05-31T23:38:48Z)
Arbiters of Ambivalence: Challenges of Using LLMs in No-Consensus Tasks [52.098988739649705]
This study examines the biases and limitations of LLMs in three roles: answer generator, judge, and debater.<n>We develop a no-consensus'' benchmark by curating examples that encompass a variety of a priori ambivalent scenarios.<n>Our results show that while LLMs can provide nuanced assessments when generating open-ended answers, they tend to take a stance on no-consensus topics when employed as judges or debaters.
arXiv Detail & Related papers (2025-05-28T01:31:54Z)
Towards Characterizing Subjectivity of Individuals through Modeling Value Conflicts and Trade-offs [22.588557390720236]
We characterize subjectivity of individuals on social media and infer their moral judgments using Large Language Models. We propose a framework, SOLAR, that observes value conflicts and trade-offs in the user-generated texts to better represent subjective ground of individuals.
arXiv Detail & Related papers (2025-04-17T04:20:05Z)
An Overview of Large Language Models for Statisticians [109.38601458831545]
Large Language Models (LLMs) have emerged as transformative tools in artificial intelligence (AI) This paper explores potential areas where statisticians can make important contributions to the development of LLMs. We focus on issues such as uncertainty quantification, interpretability, fairness, privacy, watermarking and model adaptation.
arXiv Detail & Related papers (2025-02-25T03:40:36Z)
Chat Bankman-Fried: an Exploration of LLM Alignment in Finance [4.892013668424246]
As jurisdictions enact legislation on AI safety, the concept of alignment must be defined and measured. This paper proposes an experimental framework to assess whether large language models (LLMs) adhere to ethical and legal standards in the relatively unexplored context of finance.
arXiv Detail & Related papers (2024-11-01T08:56:17Z)
Gender Bias of LLM in Economics: An Existentialism Perspective [1.024113475677323]
This paper investigates gender bias in large language models (LLMs) LLMs reinforce gender stereotypes even without explicit gender markers. We argue that bias in LLMs is not an unintended flaw but a systematic result of their rational processing.
arXiv Detail & Related papers (2024-10-14T01:42:01Z)
Uncovering Factor Level Preferences to Improve Human-Model Alignment [58.50191593880829]
We introduce PROFILE, a framework that uncovers and quantifies the influence of specific factors driving preferences. ProFILE's factor level analysis explains the 'why' behind human-model alignment and misalignment. We demonstrate how leveraging factor level insights, including addressing misaligned factors, can improve alignment with human preferences.
arXiv Detail & Related papers (2024-10-09T15:02:34Z)
Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge [84.34545223897578]
Despite their excellence in many domains, potential issues are under-explored, undermining their reliability and the scope of their utility. We identify 12 key potential biases and propose a new automated bias quantification framework-CALM- which quantifies and analyzes each type of bias in LLM-as-a-Judge. Our work highlights the need for stakeholders to address these issues and remind users to exercise caution in LLM-as-a-Judge applications.
arXiv Detail & Related papers (2024-10-03T17:53:30Z)
Evaluating Interventional Reasoning Capabilities of Large Language Models [58.52919374786108]
Large language models (LLMs) are used to automate decision-making tasks. In this paper, we evaluate whether LLMs can accurately update their knowledge of a data-generating process in response to an intervention. We create benchmarks that span diverse causal graphs (e.g., confounding, mediation) and variable types. These benchmarks allow us to isolate the ability of LLMs to accurately predict changes resulting from their ability to memorize facts or find other shortcuts.
arXiv Detail & Related papers (2024-04-08T14:15:56Z)
Explaining Large Language Models Decisions Using Shapley Values [1.223779595809275]
Large language models (LLMs) have opened up exciting possibilities for simulating human behavior and cognitive processes. However, the validity of utilizing LLMs as stand-ins for human subjects remains uncertain. This paper presents a novel approach based on Shapley values to interpret LLM behavior and quantify the relative contribution of each prompt component to the model's output.
arXiv Detail & Related papers (2024-03-29T22:49:43Z)
Exploring Value Biases: How LLMs Deviate Towards the Ideal [57.99044181599786]
Large-Language-Models (LLMs) are deployed in a wide range of applications, and their response has an increasing social impact. We show that value bias is strong in LLMs across different categories, similar to the results found in human studies.
arXiv Detail & Related papers (2024-02-16T18:28:43Z)
Do LLM Agents Exhibit Social Behavior? [5.094340963261968]
State-Understanding-Value-Action (SUVA) is a framework to systematically analyze responses in social contexts. It assesses social behavior through both their final decisions and the response generation processes leading to those decisions. We demonstrate that utterance-based reasoning reliably predicts LLMs' final actions.
arXiv Detail & Related papers (2023-12-23T08:46:53Z)
CLOMO: Counterfactual Logical Modification with Large Language Models [109.60793869938534]
We introduce a novel task, Counterfactual Logical Modification (CLOMO), and a high-quality human-annotated benchmark. In this task, LLMs must adeptly alter a given argumentative text to uphold a predetermined logical relationship. We propose an innovative evaluation metric, the Self-Evaluation Score (SES), to directly evaluate the natural language output of LLMs.
arXiv Detail & Related papers (2023-11-29T08:29:54Z)
Heterogeneous Value Alignment Evaluation for Large Language Models [91.96728871418]
Large Language Models (LLMs) have made it crucial to align their values with those of humans. We propose a Heterogeneous Value Alignment Evaluation (HVAE) system to assess the success of aligning LLMs with heterogeneous values.
arXiv Detail & Related papers (2023-05-26T02:34:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.