Related papers: Uncovering Biases with Reflective Large Language Models

Related papers

Investigating the Effects of Cognitive Biases in Prompts on Large Language Model Outputs [3.7302076138352205]
This paper investigates the influence of cognitive biases on Large Language Models (LLMs) outputs.<n> cognitive biases, such as confirmation and availability biases, can distort user inputs through prompts.
arXiv Detail & Related papers (2025-06-14T04:18:34Z)
MLLMs are Deeply Affected by Modality Bias [158.64371871084478]
Recent advances in Multimodal Large Language Models (MLLMs) have shown promising results in integrating diverse modalities such as texts and images.<n>MLLMs are heavily influenced by modality bias, often relying on language while under-utilizing other modalities like visual inputs.<n>This paper argues that MLLMs are deeply affected by modality bias, highlighting its manifestations across various tasks.
arXiv Detail & Related papers (2025-05-24T11:49:31Z)
Relative Bias: A Comparative Framework for Quantifying Bias in LLMs [29.112649816695203]
Relative Bias is a method designed to assess how an LLM's behavior deviates from other LLMs within a specified target domain.<n>We introduce two complementary methodologies: (1) Embedding Transformation analysis, which captures relative bias patterns through sentence representations over the embedding space, and (2) LLM-as-a-Judge, which employs a language model to evaluate outputs comparatively.<n>Applying our framework to several case studies on bias and alignment scenarios following by statistical tests for validation, we find strong alignment between the two scoring methods.
arXiv Detail & Related papers (2025-05-22T01:59:54Z)
Metamorphic Testing for Fairness Evaluation in Large Language Models: Identifying Intersectional Bias in LLaMA and GPT [2.380039717474099]
Large Language Models (LLMs) have made significant strides in Natural Language Processing but remain vulnerable to fairness-related issues. This paper introduces a metamorphic testing approach to systematically identify fairness bugs in LLMs.
arXiv Detail & Related papers (2025-04-04T21:04:14Z)
No LLM is Free From Bias: A Comprehensive Study of Bias Evaluation in Large Language models [0.9620910657090186]
Large Language Models (LLMs) have increased the performance of different natural language understanding as well as generation tasks. Although LLMs have breached the state-of-the-art performance in various tasks, they often reflect different forms of bias present in the training data. We provide a unified evaluation of benchmarks using a set of representative LLMs that cover different forms of biases starting from physical characteristics to socio-economic categories.
arXiv Detail & Related papers (2025-03-15T03:58:14Z)
VilBias: A Study of Bias Detection through Linguistic and Visual Cues , presenting Annotation Strategies, Evaluation, and Key Challenges [2.2751168722976587]
VLBias is a framework that leverages state-of-the-art Large Language Models (LLMs) and Vision-Language Models (VLMs) to detect linguistic and visual biases in news content. We present a multimodal dataset comprising textual content and corresponding images from diverse news sources.
arXiv Detail & Related papers (2024-12-22T15:05:30Z)
A Multi-LLM Debiasing Framework [85.17156744155915]
Large Language Models (LLMs) are powerful tools with the potential to benefit society immensely, yet, they have demonstrated biases that perpetuate societal inequalities. Recent research has shown a growing interest in multi-LLM approaches, which have been demonstrated to be effective in improving the quality of reasoning. We propose a novel multi-LLM debiasing framework aimed at reducing bias in LLMs.
arXiv Detail & Related papers (2024-09-20T20:24:50Z)
Unboxing Occupational Bias: Grounded Debiasing of LLMs with U.S. Labor Data [9.90951705988724]
Large Language Models (LLM) are prone to inheriting and amplifying societal biases. LLM bias can have far-reaching consequences, leading to unfair practices and exacerbating social inequalities.
arXiv Detail & Related papers (2024-08-20T23:54:26Z)
Social Debiasing for Fair Multi-modal LLMs [55.8071045346024]
Multi-modal Large Language Models (MLLMs) have advanced significantly, offering powerful vision-language understanding capabilities. However, these models often inherit severe social biases from their training datasets, leading to unfair predictions based on attributes like race and gender. This paper addresses the issue of social biases in MLLMs by i) Introducing a comprehensive Counterfactual dataset with Multiple Social Concepts (CMSC) and ii) Proposing an Anti-Stereotype Debiasing strategy (ASD)
arXiv Detail & Related papers (2024-08-13T02:08:32Z)
CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models [58.57987316300529]
Large Language Models (LLMs) are increasingly deployed to handle various natural language processing (NLP) tasks. To evaluate the biases exhibited by LLMs, researchers have recently proposed a variety of datasets. We propose CEB, a Compositional Evaluation Benchmark that covers different types of bias across different social groups and tasks.
arXiv Detail & Related papers (2024-07-02T16:31:37Z)
RUPBench: Benchmarking Reasoning Under Perturbations for Robustness Evaluation in Large Language Models [12.112914393948415]
We present RUPBench, a benchmark designed to evaluate large language models (LLMs) across diverse reasoning tasks. Our benchmark incorporates 15 reasoning datasets, categorized into commonsense, arithmetic, logical, and knowledge-intensive reasoning. By examining the performance of state-of-the-art LLMs such as GPT-4o, Llama3, Phi-3, and Gemma on both original and perturbed datasets, we provide a detailed analysis of their robustness and error patterns.
arXiv Detail & Related papers (2024-06-16T17:26:44Z)
Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination Trends [38.86240794422485]
We evaluate the faithfulness of large language models for dialogue summarization. Our evaluation reveals subtleties as to what constitutes a hallucination. We introduce two prompt-based approaches for fine-grained error detection that outperform existing metrics.
arXiv Detail & Related papers (2024-06-05T17:49:47Z)
Debiasing Multimodal Large Language Models [61.6896704217147]
Large Vision-Language Models (LVLMs) have become indispensable tools in computer vision and natural language processing. Our investigation reveals a noteworthy bias in the generated content, where the output is primarily influenced by the underlying Large Language Models (LLMs) prior to the input image. To rectify these biases and redirect the model's focus toward vision information, we introduce two simple, training-free strategies.
arXiv Detail & Related papers (2024-03-08T12:35:07Z)
On the Out-Of-Distribution Generalization of Multimodal Large Language Models [24.431960338495184]
We investigate the generalization boundaries of current Multimodal Large Language Models (MLLMs) We evaluate their zero-shot generalization across synthetic images, real-world distributional shifts, and specialized datasets like medical and molecular imagery. We show that in-context learning can significantly enhance MLLMs' generalization, opening new avenues for overcoming generalization barriers.
arXiv Detail & Related papers (2024-02-09T18:21:51Z)
Large Language Model (LLM) Bias Index -- LLMBI [0.0]
The Large Language Model Bias Index (LLMBI) is a pioneering approach designed to quantify and address biases inherent in large language models (LLMs) We formulated LLMBI using a composite scoring system incorporating multiple dimensions of bias, including but not limited to age, gender, and racial biases. Our empirical analysis, conducted using responses from OpenAI's API, employs advanced sentiment analysis as a representative method for bias detection.
arXiv Detail & Related papers (2023-12-22T15:38:13Z)
Exploring the Jungle of Bias: Political Bias Attribution in Language Models via Dependency Analysis [86.49858739347412]
Large Language Models (LLMs) have sparked intense debate regarding the prevalence of bias in these models and its mitigation. We propose a prompt-based method for the extraction of confounding and mediating attributes which contribute to the decision process. We find that the observed disparate treatment can at least in part be attributed to confounding and mitigating attributes and model misalignment.
arXiv Detail & Related papers (2023-11-15T00:02:25Z)
Are Large Language Models Really Robust to Word-Level Perturbations? [68.60618778027694]
We propose a novel rational evaluation approach that leverages pre-trained reward models as diagnostic tools. Longer conversations manifest the comprehensive grasp of language models in terms of their proficiency in understanding questions. Our results demonstrate that LLMs frequently exhibit vulnerability to word-level perturbations that are commonplace in daily language usage.
arXiv Detail & Related papers (2023-09-20T09:23:46Z)

This list is automatically generated from the titles and abstracts of the papers in this site.