Auto-Search and Refinement: An Automated Framework for Gender Bias Mitigation in Large Language Models
- URL: http://arxiv.org/abs/2502.11559v1
- Date: Mon, 17 Feb 2025 08:44:04 GMT
- Title: Auto-Search and Refinement: An Automated Framework for Gender Bias Mitigation in Large Language Models
- Authors: Yue Xu, Chengyan Fu, Li Xiong, Sibei Yang, Wenjie Wang
- Abstract summary: Pre-training large language models (LLMs) on vast text corpora enhances natural language processing capabilities but risks encoding social biases, particularly gender bias. We propose $\textit{FaIRMaker}$, an automated and model-independent framework that employs an auto-search and refinement paradigm to adaptively generate Fairwords. Experiments demonstrate that $\textit{FaIRMaker}$ automatically searches for and dynamically refines Fairwords, effectively mitigating gender bias while preserving task integrity.
- Score: 25.291029168327874
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-training large language models (LLMs) on vast text corpora enhances natural language processing capabilities but risks encoding social biases, particularly gender bias. While parameter-modification methods like fine-tuning mitigate bias, they are resource-intensive, unsuitable for closed-source models, and lack adaptability to evolving societal norms. Instruction-based approaches offer flexibility but often compromise task performance. To address these limitations, we propose $\textit{FaIRMaker}$, an automated and model-independent framework that employs an $\textbf{auto-search and refinement}$ paradigm to adaptively generate Fairwords, which act as instructions integrated into input queries to reduce gender bias and enhance response quality. Extensive experiments demonstrate that $\textit{FaIRMaker}$ automatically searches for and dynamically refines Fairwords, effectively mitigating gender bias while preserving task integrity and ensuring compatibility with both API-based and open-source LLMs.
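The abstract describes Fairwords as instructions integrated into input queries. A minimal sketch of that integration step alone might look like the following. The function name `apply_fairword`, the `position` parameter, and the example Fairword string are all hypothetical, and nothing here reproduces how FaIRMaker searches for or refines Fairwords, which this page does not describe.

```python
def apply_fairword(query: str, fairword: str, position: str = "prefix") -> str:
    """Combine a Fairword instruction with a user query.

    Assumption: the Fairword is plain text placed before or after the query.
    The abstract only says Fairwords are "integrated into input queries";
    the placement options here are illustrative, not taken from the paper.
    """
    query = query.strip()
    fairword = fairword.strip()
    if not fairword:
        # No instruction supplied: pass the query through unchanged.
        return query
    if position == "prefix":
        return f"{fairword}\n\n{query}"
    if position == "suffix":
        return f"{query}\n\n{fairword}"
    raise ValueError(f"unknown position: {position!r}")


if __name__ == "__main__":
    # Made-up Fairword for illustration only.
    example_fairword = "Answer without relying on gender stereotypes."
    print(apply_fairword("Describe a typical day for a nurse.", example_fairword))
```

Because the augmentation only touches the input text, a sketch like this works the same way for an API-based model and a locally hosted one, which matches the compatibility claim in the abstract.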
Related papers
- Generalizing Large Language Model Usability Across Resource-Constrained [0.43512163406552007]
This dissertation presents a systematic study toward generalizing Large Language Models under real-world constraints.
First, it introduces a robust text-centric alignment framework that enables LLMs to seamlessly integrate diverse modalities.
Beyond the multimodal setting, the dissertation investigates inference-time optimization strategies for LLMs.
arXiv Detail & Related papers (2025-05-13T01:00:12Z)
- Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs [60.881609323604685]
Large Language Models (LLMs) accessed via black-box APIs introduce a trust challenge.
Users pay for services based on advertised model capabilities.
However, providers may covertly substitute the specified model with a cheaper, lower-quality alternative to reduce operational costs.
This lack of transparency undermines fairness, erodes trust, and complicates reliable benchmarking.
arXiv Detail & Related papers (2025-04-07T03:57:41Z)
- Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute [61.00662702026523]
We propose a unified Test-Time Compute (TTC) scaling framework that leverages increased inference-time computation instead of larger models.
Our framework incorporates two complementary strategies: internal TTC and external TTC.
We demonstrate that our 32B model achieves a 46% issue resolution rate, surpassing significantly larger models such as DeepSeek R1 671B and OpenAI o1.
arXiv Detail & Related papers (2025-03-31T07:31:32Z)
- Synthetic Data Generation Using Large Language Models: Advances in Text and Code [0.0]
Large language models (LLMs) have unlocked new possibilities for generating synthetic training data in both natural language and code.
We show how these methods enrich low-resource tasks such as classification and question answering.
We address challenges like factual inaccuracies in generated text, lack of stylistic realism, and the risk of bias amplification.
arXiv Detail & Related papers (2025-03-18T08:34:03Z)
- $\texttt{SEM-CTRL}$: Semantically Controlled Decoding [53.86639808659575]
$\texttt{SEM-CTRL}$ is a unified approach that enforces rich context-sensitive constraints and task- and instance-specific semantics directly on an LLM decoder.
$\texttt{SEM-CTRL}$ allows small pre-trained LLMs to efficiently outperform larger variants and state-of-the-art reasoning models.
arXiv Detail & Related papers (2025-03-03T18:33:46Z)
- DR.GAP: Mitigating Bias in Large Language Models using Gender-Aware Prompting with Demonstration and Reasoning [14.690803375468661]
Large Language Models (LLMs) exhibit strong natural language processing capabilities but also inherit and amplify societal biases, including gender bias, raising fairness concerns.
We propose DR.GAP (Demonstration and Reasoning for Gender-Aware Prompting), an automated and model-agnostic approach that mitigates gender bias while preserving model performance.
arXiv Detail & Related papers (2025-02-17T09:43:36Z)
- The Open Source Advantage in Large Language Models (LLMs) [0.0]
Large language models (LLMs) have rapidly advanced natural language processing, driving significant breakthroughs in tasks such as text generation, machine translation, and domain-specific reasoning.
The field now faces a critical dilemma in its approach: closed-source models like GPT-4 deliver state-of-the-art performance but restrict accessibility and external oversight.
Open-source frameworks like LLaMA and Mixtral democratize access, foster collaboration, and support diverse applications, achieving competitive results through techniques like instruction tuning and LoRA.
arXiv Detail & Related papers (2024-12-16T17:32:11Z)
- Autoformulation of Mathematical Optimization Models Using LLMs [50.030647274271516]
This paper approaches the problem of $\textit{autoformulation}$: the automated creation of solver-ready optimization models from natural language problem descriptions.
We identify three core challenges of autoformulation, including $\textit{(1)}$ the vast, problem-dependent hypothesis space and $\textit{(2)}$ efficient and diverse exploration of this space under uncertainty.
We present a novel method leveraging $\textit{Large Language Models}$ with $\textit{Monte-Carlo Tree Search}$, exploiting the hierarchical nature of optimization modeling to generate and systematically explore possible formulations.
arXiv Detail & Related papers (2024-11-03T20:41:38Z)
- Optimizing Large Language Models for Dynamic Constraints through Human-in-the-Loop Discriminators [0.0]
Large Language Models (LLMs) have recently demonstrated impressive capabilities across various real-world applications.
We propose a flexible framework that enables LLMs to interact with system interfaces, summarize constraint concepts, and continually optimize performance metrics.
Our framework achieved a $7.78\%$ pass rate with the human discriminator and a $6.11\%$ pass rate with the LLM-based discriminator.
arXiv Detail & Related papers (2024-10-19T17:27:38Z)
- The GUS Framework: Benchmarking Social Bias Classification with Discriminative (Encoder-Only) and Generative (Decoder-Only) Language Models [3.7716682697752506]
Generalizations, Unfairness, and Stereotypes (the GUS framework) are considered as key linguistic components underlying social bias.
The GUS framework employs a semi-automated approach to create a comprehensive synthetic dataset, which is verified by humans to maintain ethical standards.
Our methodology, which combines discriminative (encoder-only) models and generative (auto-regressive) large language models, identifies biased entities in text.
arXiv Detail & Related papers (2024-10-10T21:51:22Z)
- Enhancing LLM-Based Text Classification in Political Science: Automatic Prompt Optimization and Dynamic Exemplar Selection for Few-Shot Learning [1.6967824074619953]
Large language models (LLMs) offer substantial promise for text classification in political science.
Our framework enhances LLM performance through automatic prompt optimization, dynamic exemplar selection, and a consensus mechanism.
An open-source Python package (PoliPrompt) is available on GitHub.
arXiv Detail & Related papers (2024-09-02T21:05:31Z)
- Editable Fairness: Fine-Grained Bias Mitigation in Language Models [52.66450426729818]
We propose a novel debiasing approach, Fairness Stamp (FAST), which enables fine-grained calibration of individual social biases.
FAST surpasses state-of-the-art baselines with superior debiasing performance.
This highlights the potential of fine-grained debiasing strategies to achieve fairness in large language models.
arXiv Detail & Related papers (2024-08-07T17:14:58Z)
- Unlocking the Potential of Model Merging for Low-Resource Languages [66.7716891808697]
Adapting large language models to new languages typically involves continual pre-training (CT) followed by supervised fine-tuning (SFT).
We propose model merging as an alternative for low-resource languages, combining models with distinct capabilities into a single model without additional training.
Experiments based on Llama-2-7B demonstrate that model merging effectively endows LLMs for low-resource languages with task-solving abilities, outperforming CT-then-SFT in scenarios with extremely scarce data.
arXiv Detail & Related papers (2024-07-04T15:14:17Z)
- CELA: Cost-Efficient Language Model Alignment for CTR Prediction [70.65910069412944]
Click-Through Rate (CTR) prediction holds a paramount position in recommender systems.
Recent efforts have sought to mitigate these challenges by integrating Pre-trained Language Models (PLMs).
We propose Cost-Efficient Language Model Alignment (CELA) for CTR prediction.
arXiv Detail & Related papers (2024-05-17T07:43:25Z)
- AXOLOTL: Fairness through Assisted Self-Debiasing of Large Language Model Outputs [20.772266479533776]
AXOLOTL is a novel post-processing framework that operates agnostically across tasks and models.
It identifies biases, proposes resolutions, and guides the model to self-debias its outputs.
This approach minimizes computational costs and preserves model performance.
arXiv Detail & Related papers (2024-03-01T00:02:37Z)
- Generative Judge for Evaluating Alignment [84.09815387884753]
We propose a generative judge with 13B parameters, Auto-J, designed to address these challenges.
Our model is trained on user queries and LLM-generated responses under massive real-world scenarios.
Experimentally, Auto-J outperforms a series of strong competitors, including both open-source and closed-source models.
arXiv Detail & Related papers (2023-10-09T07:27:15Z)
- TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing [73.16475763422446]
We propose a multilingual robustness evaluation platform for NLP tasks (TextFlint).
It incorporates universal text transformation, task-specific transformation, adversarial attack, subpopulation, and their combinations to provide comprehensive robustness analysis.
TextFlint generates complete analytical reports as well as targeted augmented data to address the shortcomings of the model's robustness.
arXiv Detail & Related papers (2021-03-21T17:20:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.