BTC-SAM: Leveraging LLMs for Generation of Bias Test Cases for Sentiment Analysis Models
- URL: http://arxiv.org/abs/2509.24101v2
- Date: Wed, 15 Oct 2025 09:50:59 GMT
- Title: BTC-SAM: Leveraging LLMs for Generation of Bias Test Cases for Sentiment Analysis Models
- Authors: Zsolt T. Kardkovacs, Lynda Djennane, Anna Field, Boualem Benatallah, Yacine Gaci, Fabio Casati, Walid Gaaloul
- Abstract summary: Sentiment Analysis (SA) models harbor inherent social biases that can be harmful in real-world applications. We present a novel bias testing framework, BTC-SAM, which generates high-quality test cases for bias testing in SA models with minimal specification.
- Score: 1.5637023740732419
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sentiment Analysis (SA) models harbor inherent social biases that can be harmful in real-world applications. These biases are identified by examining the output of SA models for sentences that only vary in the identity groups of the subjects. Constructing natural, linguistically rich, relevant, and diverse sets of sentences that provide sufficient coverage over the domain is expensive, especially when addressing a wide range of biases: it requires domain experts and/or crowd-sourcing. In this paper, we present a novel bias testing framework, BTC-SAM, which generates high-quality test cases for bias testing in SA models with minimal specification using Large Language Models (LLMs) for the controllable generation of test sentences. Our experiments show that relying on LLMs can provide high linguistic variation and diversity in the test sentences, thereby offering better test coverage compared to base prompting methods even for previously unseen biases.
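The testing recipe the abstract describes, namely scoring sentences that vary only in the identity group of the subject and comparing the SA model's outputs, can be illustrated with a short Python sketch. The model name, templates, identity terms, and gap threshold below are illustrative assumptions and are not part of BTC-SAM:

```python
# Minimal sketch of identity-substitution bias testing for a sentiment model.
# Everything named here (model, templates, groups, threshold) is a placeholder.
from itertools import combinations
from transformers import pipeline

# Sentiment classifier under test; the checkpoint name is only an example.
sentiment = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

# Hand-written templates with an identity-group slot (the "base" approach the
# abstract contrasts with LLM-generated test sentences).
templates = [
    "The {group} engineer explained the design calmly.",
    "My {group} neighbour hosted a dinner party.",
]
groups = ["young", "elderly", "immigrant", "disabled"]

def signed_score(text: str) -> float:
    """Map the classifier output to a signed sentiment score in [-1, 1]."""
    result = sentiment(text)[0]
    return result["score"] if result["label"] == "POSITIVE" else -result["score"]

for template in templates:
    scores = {g: signed_score(template.format(group=g)) for g in groups}
    for a, b in combinations(groups, 2):
        gap = abs(scores[a] - scores[b])
        if gap > 0.3:  # illustrative threshold for flagging a potential bias signal
            print(f"{template!r}: {a} vs. {b} sentiment gap = {gap:.2f}")
```

BTC-SAM's contribution is to replace the hand-written templates above with LLM-generated test sentences, which is where the linguistic variation, diversity, and coverage gains reported in the abstract come from.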
Related papers
- BiasLab: A Multilingual, Dual-Framing Framework for Robust Measurement of Output-Level Bias in Large Language Models [3.643198597030366]
This paper introduces BiasLab, an open-source, model-agnostic evaluation framework for quantifying output-level (extrinsic) bias. The framework supports evaluation across diverse bias axes, including demographic, cultural, political, and geopolitical topics.
arXiv Detail & Related papers (2026-01-11T11:07:46Z)
- Addressing Stereotypes in Large Language Models: A Critical Examination and Mitigation [0.0]
Large Language Models (LLMs) have gained popularity in recent years with the advancement of Natural Language Processing (NLP). This study inspects and highlights the need to address biases in LLMs amid growing generative Artificial Intelligence (AI). We utilize bias-specific benchmarks such as StereoSet and CrowS-Pairs to evaluate the existence of various biases in many different generative models such as BERT, GPT 3.5, and ADA.
arXiv Detail & Related papers (2025-11-18T05:43:34Z)
- A Comprehensive Study of Implicit and Explicit Biases in Large Language Models [1.0555164678638427]
This study highlights the need to address biases in Large Language Models amid growing generative AI. We studied bias-specific benchmarks such as StereoSet and CrowS-Pairs to evaluate the existence of various biases in multiple generative models such as BERT and GPT 3.5. Results indicated that fine-tuned models struggled with gender biases but excelled at identifying and avoiding racial biases.
arXiv Detail & Related papers (2025-11-18T05:27:17Z)
- BiasFreeBench: a Benchmark for Mitigating Bias in Large Language Model Responses [32.58830706120845]
Existing studies on bias mitigation methods for large language models (LLMs) use diverse baselines and metrics to evaluate debiasing performance. We introduce BiasFreeBench, an empirical benchmark that comprehensively compares eight mainstream bias mitigation techniques. We will publicly release our benchmark, aiming to establish a unified testbed for bias mitigation research.
arXiv Detail & Related papers (2025-09-30T19:56:54Z)
- Relative Bias: A Comparative Framework for Quantifying Bias in LLMs [29.112649816695203]
Relative Bias is a method designed to assess how an LLM's behavior deviates from other LLMs within a specified target domain. We introduce two complementary methodologies: (1) Embedding Transformation analysis, which captures relative bias patterns through sentence representations over the embedding space, and (2) LLM-as-a-Judge, which employs a language model to evaluate outputs comparatively. Applying our framework to several case studies on bias and alignment scenarios, followed by statistical tests for validation, we find strong alignment between the two scoring methods.
arXiv Detail & Related papers (2025-05-22T01:59:54Z)
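The Embedding Transformation analysis described in the entry above lends itself to a small illustration. The sketch below is only a rough approximation under stated assumptions: the encoder, prompts, toy responses, and the cosine-distance aggregate are placeholders, not the paper's actual procedure. It embeds two models' responses to the same prompts and treats their divergence in embedding space as a relative-bias signal, which would then be followed by statistical testing.

```python
# Assumption-laden sketch: compare two LLMs' outputs in sentence-embedding space.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # example encoder

prompts = ["Describe a typical nurse.", "Describe a typical CEO."]
responses_a = ["A nurse is a caring woman who ...", "A CEO is a driven man who ..."]          # model A (toy)
responses_b = ["A nurse is a trained professional ...", "A CEO is an experienced leader ..."]  # model B (toy)

emb_a = encoder.encode(responses_a, normalize_embeddings=True)
emb_b = encoder.encode(responses_b, normalize_embeddings=True)

# Per-prompt cosine distance between the two models' responses; the mean acts
# as a crude relative-deviation signal for downstream statistical tests.
distances = 1.0 - np.sum(emb_a * emb_b, axis=1)
print({p: round(float(d), 3) for p, d in zip(prompts, distances)})
print("mean relative deviation:", float(distances.mean()))
```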
- Challenges in Testing Large Language Model Based Software: A Faceted Taxonomy [8.927002750209295]
Large Language Models (LLMs) and Multi-Agent LLMs (MALLMs) introduce non-determinism unlike traditional or machine learning software. This paper presents a taxonomy for LLM test case design, informed by research literature and our experience.
arXiv Detail & Related papers (2025-03-01T13:15:56Z)
- OLMES: A Standard for Language Model Evaluations [64.85905119836818]
OLMES is a documented, practical, open standard for reproducible language model evaluations. It supports meaningful comparisons between smaller base models that require the unnatural "cloze" formulation of multiple-choice questions. OLMES includes well-considered, documented recommendations guided by results from existing literature as well as new experiments resolving open questions.
arXiv Detail & Related papers (2024-06-12T17:37:09Z)
- LangBiTe: A Platform for Testing Bias in Large Language Models [1.9744907811058787]
Large Language Models (LLMs) are trained on a vast amount of data scraped from forums, websites, social media and other internet sources. LangBiTe enables development teams to tailor their test scenarios and automatically generate and execute the test cases according to a set of user-defined ethical requirements. LangBiTe provides users with the bias evaluation of LLMs, and end-to-end traceability between the initial ethical requirements and the insights obtained.
arXiv Detail & Related papers (2024-04-29T10:02:45Z)
- Bias in Language Models: Beyond Trick Tests and Toward RUTEd Evaluation [49.3814117521631]
Standard benchmarks of bias and fairness in large language models (LLMs) measure the association between the user attributes stated or implied by a prompt and the model's response. We develop analogous RUTEd evaluations from three contexts of real-world use: children's bedtime stories, user personas, and English language learning exercises. We find that standard bias metrics have no significant correlation with the more realistic bias metrics.
arXiv Detail & Related papers (2024-02-20T01:49:15Z)
- Generative Judge for Evaluating Alignment [84.09815387884753]
We propose a generative judge with 13B parameters, Auto-J, designed to address these challenges.
Our model is trained on user queries and LLM-generated responses under massive real-world scenarios.
Experimentally, Auto-J outperforms a series of strong competitors, including both open-source and closed-source models.
arXiv Detail & Related papers (2023-10-09T07:27:15Z)
- FairMonitor: A Four-Stage Automatic Framework for Detecting Stereotypes and Biases in Large Language Models [10.57405233305553]
This paper introduces a four-stage framework to directly evaluate stereotypes and biases in the generated content of Large Language Models (LLMs)
Using the education sector as a case study, we constructed the Edu-FairMonitor based on the four-stage framework.
Experimental results reveal varying degrees of stereotypes and biases in five LLMs evaluated on Edu-FairMonitor.
arXiv Detail & Related papers (2023-08-21T00:25:17Z)
- BiasTestGPT: Using ChatGPT for Social Bias Testing of Language Models [73.29106813131818]
Bias testing is currently cumbersome since the test sentences are generated from a limited set of manual templates or need expensive crowd-sourcing.
We propose using ChatGPT for the controllable generation of test sentences, given any arbitrary user-specified combination of social groups and attributes.
We present an open-source comprehensive bias testing framework (BiasTestGPT), hosted on HuggingFace, that can be plugged into any open-source PLM for bias testing.
arXiv Detail & Related papers (2023-02-14T22:07:57Z)
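The controllable-generation idea shared by BiasTestGPT and BTC-SAM, prompting an LLM with user-specified social groups and attributes to obtain test sentences, can be sketched as follows. The prompt wording, helper function, model name, and client are illustrative placeholders, not the prompts or pipeline used in either paper:

```python
# Hedged sketch: ask an LLM for test sentences that vary only in the group mention.
# Assumes the OpenAI Python client and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def generate_test_sentences(groups, attribute, n_per_group=3, model="gpt-4o-mini"):
    """Hypothetical helper: build a generation prompt and return one sentence per line."""
    prompt = (
        f"Write {n_per_group} natural, everyday sentences per group that link the "
        f"attribute '{attribute}' to a person, for each of these groups: {', '.join(groups)}. "
        "Keep the sentences identical across groups except for the group mention, "
        "and return one sentence per line prefixed by the group name."
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.splitlines()

# Example: counterfactual sentences for a 'leadership skills' attribute.
sentences = generate_test_sentences(["men", "women"], "leadership skills")
```

The generated sentences could then be fed to a sentiment-gap check like the one sketched after the abstract above.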
- Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases [55.45617404586874]
We propose a few-shot instruction-based method for prompting pre-trained language models (LMs) to detect social biases.
We show that large LMs can detect different types of fine-grained biases with similar and sometimes superior accuracy to fine-tuned models.
arXiv Detail & Related papers (2021-12-15T04:19:52Z)
- AES Systems Are Both Overstable And Oversensitive: Explaining Why And Proposing Defenses [66.49753193098356]
We investigate the reason behind the surprising adversarial brittleness of scoring models.
Our results indicate that autoscoring models, despite getting trained as "end-to-end" models, behave like bag-of-words models.
We propose detection-based protection models that can detect oversensitivity- and overstability-causing samples with high accuracy.
arXiv Detail & Related papers (2021-09-24T03:49:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.