Homogeneity Bias as Differential Sampling Uncertainty in Language Models
- URL: http://arxiv.org/abs/2501.19337v1
- Date: Fri, 31 Jan 2025 17:36:12 GMT
- Title: Homogeneity Bias as Differential Sampling Uncertainty in Language Models
- Authors: Messi H. J. Lee, Soyeon Jeon
- Abstract summary: Large Language Models (LLMs) and Vision-Language Models (VLMs) represent marginalized groups more homogeneously than dominant groups. We propose that this bias emerges from systematic differences in the probability distributions from which tokens are sampled at inference time.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Prior research shows that Large Language Models (LLMs) and Vision-Language Models (VLMs) represent marginalized groups more homogeneously than dominant groups. However, the mechanisms underlying this homogeneity bias remain relatively unexplored. We propose that this bias emerges from systematic differences in the probability distributions from which tokens are sampled at inference time. Analyzing three measures of uncertainty in token sampling distributions (entropy, perplexity, and probability of differentiation), we find that in some models, specifically GPT-4 Turbo and Llama-3.2, tokens are sampled more deterministically when generating texts about marginalized groups (i.e., Black Americans and women) compared to their dominant group counterparts (i.e., White Americans and men). While these findings may help explain homogeneity bias in certain models, the patterns did not replicate across all VLMs tested, suggesting multiple mechanisms may contribute to homogeneity bias in AI.
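To make the abstract's three uncertainty measures concrete, the sketch below computes them for a single next-token sampling distribution. It is a minimal illustration assuming standard definitions (Shannon entropy, perplexity as the exponential of entropy, and probability of differentiation as the chance that two independently sampled tokens differ); the function name, the NumPy dependency, and the toy distributions are illustrative choices, not the authors' implementation.

```python
import numpy as np

def sampling_uncertainty(probs):
    """Uncertainty measures for one next-token sampling distribution.

    Minimal sketch: `probs` is assumed to be a 1-D array of token
    probabilities summing to 1 (e.g., the softmax output for one position).
    """
    p = np.asarray(probs, dtype=float)
    p = p[p > 0]                                 # convention: 0 * log(0) = 0
    entropy = -np.sum(p * np.log(p))             # Shannon entropy in nats
    perplexity = np.exp(entropy)                 # effective number of candidate tokens
    prob_differentiation = 1.0 - np.sum(p ** 2)  # chance two independent samples differ
    return entropy, perplexity, prob_differentiation

# A peaked (more deterministic) distribution scores lower on all three measures
# than a flat one; the paper reports this more-deterministic pattern for texts
# about marginalized groups in GPT-4 Turbo and Llama-3.2.
print(sampling_uncertainty([0.90, 0.05, 0.03, 0.02]))
print(sampling_uncertainty([0.25, 0.25, 0.25, 0.25]))
```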
Related papers
- Planted in Pretraining, Swayed by Finetuning: A Case Study on the Origins of Cognitive Biases in LLMs [51.00909549291524]
Large language models (LLMs) exhibit cognitive biases. These biases vary across models and can be amplified by instruction tuning. It remains unclear if these differences in biases stem from pretraining, finetuning, or even random noise.
arXiv Detail & Related papers (2025-07-09T18:01:14Z)
- On the Origins of Sampling Bias: Implications on Fairness Measurement and Mitigation [0.0]
Several sources of bias exist, and it is assumed that bias resulting from machine learning is borne equally by different groups.
The term sampling bias, in particular, is used inconsistently in the literature to describe bias due to the sampling procedure.
We introduce clearly defined variants of sampling bias, namely sample size bias (SSB) and underrepresentation bias (URB).
arXiv Detail & Related papers (2025-03-23T06:23:07Z)
- Evaluating Binary Decision Biases in Large Language Models: Implications for Fair Agent-Based Financial Simulations [15.379345372327375]
Large Language Models (LLMs) are increasingly being used to simulate human-like decision making in agent-based financial market models. We test three state-of-the-art GPT models for bias using two model sampling approaches: one-shot and few-shot API queries.
arXiv Detail & Related papers (2025-01-20T10:36:51Z)
- Examining the Robustness of Homogeneity Bias to Hyperparameter Adjustments in GPT-4 [0.0]
Vision-Language Models trained on massive collections of human-generated data often reproduce and amplify societal stereotypes. We investigate how this bias responds to hyperparameter adjustments in GPT-4. We find that Black Americans and women are represented more homogeneously than White Americans and men.
arXiv Detail & Related papers (2025-01-04T06:51:49Z)
- Probability of Differentiation Reveals Brittleness of Homogeneity Bias in GPT-4 [0.0]
Homogeneity bias in Large Language Models (LLMs) refers to their tendency to homogenize the representations of some groups compared to others. Previous studies documenting this bias have predominantly used encoder models, which may have inadvertently introduced biases. This study directly assessed homogeneity bias from the model's outputs, bypassing encoder models.
arXiv Detail & Related papers (2024-07-10T02:56:55Z)
- It's an Alignment, Not a Trade-off: Revisiting Bias and Variance in Deep Models [51.66015254740692]
We show that for an ensemble of deep learning based classification models, bias and variance are aligned at a sample level.
We study this phenomenon from two theoretical perspectives: calibration and neural collapse.
arXiv Detail & Related papers (2023-10-13T17:06:34Z)
- Shedding light on underrepresentation and Sampling Bias in machine learning [0.0]
We show how discrimination can be decomposed into variance, bias, and noise.
We challenge the commonly accepted mitigation approach that discrimination can be addressed by collecting more samples of the underrepresented group.
arXiv Detail & Related papers (2023-06-08T09:34:20Z)
- Stable Bias: Analyzing Societal Representations in Diffusion Models [72.27121528451528]
We propose a new method for exploring the social biases in Text-to-Image (TTI) systems.
Our approach relies on characterizing the variation in generated images triggered by enumerating gender and ethnicity markers in the prompts.
We leverage this method to analyze images generated by 3 popular TTI systems and find that while all of their outputs show correlations with US labor demographics, they also consistently under-represent marginalized identities to different extents.
arXiv Detail & Related papers (2023-03-20T19:32:49Z)
- Picking on the Same Person: Does Algorithmic Monoculture lead to Outcome Homogenization? [90.35044668396591]
A recurring theme in machine learning is algorithmic monoculture: the same systems, or systems that share components, are deployed by multiple decision-makers.
We propose the component-sharing hypothesis: if decision-makers share components like training data or specific models, then they will produce more homogeneous outcomes.
We test this hypothesis on algorithmic fairness benchmarks, demonstrating that sharing training data reliably exacerbates homogenization.
We conclude with philosophical analyses of and societal challenges for outcome homogenization, with an eye towards implications for deployed machine learning systems.
arXiv Detail & Related papers (2022-11-25T09:33:11Z)
- Statistical Properties of the Entropy from Ordinal Patterns [55.551675080361335]
Knowing the joint distribution of the pair Entropy-Statistical Complexity for a large class of time series models would allow statistical tests that are unavailable to date.
We characterize the distribution of the empirical Shannon's Entropy for any model under which the true normalized Entropy is neither zero nor one.
We present a bilateral test that verifies if there is enough evidence to reject the hypothesis that two signals produce ordinal patterns with the same Shannon's Entropy.
arXiv Detail & Related papers (2022-09-15T23:55:58Z)
- Domain Adaptation meets Individual Fairness. And they get along [48.95808607591299]
We show that algorithmic fairness interventions can help machine learning models overcome distribution shifts.
In particular, we show that enforcing suitable notions of individual fairness (IF) can improve the out-of-distribution accuracy of ML models.
arXiv Detail & Related papers (2022-05-01T16:19:55Z)
- Fair Group-Shared Representations with Normalizing Flows [68.29997072804537]
We develop a fair representation learning algorithm which is able to map individuals belonging to different groups into a single group.
We show experimentally that our methodology is competitive with other fair representation learning algorithms.
arXiv Detail & Related papers (2022-01-17T10:49:49Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
- A Generative Approach for Mitigating Structural Biases in Natural Language Inference [24.44419010439227]
In this work, we reformulate the NLI task as a generative task, where a model is conditioned on the biased subset of the input and the label.
We show that this approach is highly robust to large amounts of bias.
We find that generative models are difficult to train and they generally perform worse than discriminative baselines.
arXiv Detail & Related papers (2021-08-31T17:59:45Z)
- LOGAN: Local Group Bias Detection by Clustering [86.38331353310114]
We argue that evaluating bias at the corpus level is not enough for understanding how biases are embedded in a model.
We propose LOGAN, a new bias detection technique based on clustering.
Experiments on toxicity classification and object classification tasks show that LOGAN identifies bias in a local region.
arXiv Detail & Related papers (2020-10-06T16:42:51Z)
- Contextuality scenarios arising from networks of stochastic processes [68.8204255655161]
An empirical model is said to be contextual if its distributions cannot be obtained by marginalizing a joint distribution over X.
We present a different and classical source of contextual empirical models: the interaction among many processes.
The statistical behavior of the network in the long run makes the empirical model generically contextual and even strongly contextual.
arXiv Detail & Related papers (2020-06-22T16:57:52Z)