From Detection to Mitigation: Addressing Gender Bias in Chinese Texts via Efficient Tuning and Voting-Based Rebalancing
- URL: http://arxiv.org/abs/2509.07889v1
- Date: Tue, 09 Sep 2025 16:12:11 GMT
- Title: From Detection to Mitigation: Addressing Gender Bias in Chinese Texts via Efficient Tuning and Voting-Based Rebalancing
- Authors: Chengyan Wu, Yiqiang Cai, Yufei Cheng, Yun Xue
- Abstract summary: This paper presents our team's solution to Shared Task 7 of NLPCC-2025, which focuses on sentence-level gender bias detection and mitigation in Chinese. We adopt a fine-tuning approach based on large language models (LLMs), efficiently adapting to the bias detection task via Low-Rank Adaptation (LoRA). Our method ultimately achieves an average score of 47.90%, ranking fourth in the shared task.
- Score: 4.501499967947747
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents our team's solution to Shared Task 7 of NLPCC-2025, which focuses on sentence-level gender bias detection and mitigation in Chinese. The task aims to promote fairness and controllability in natural language generation by automatically detecting, classifying, and mitigating gender bias. To address this challenge, we adopt a fine-tuning approach based on large language models (LLMs), efficiently adapting to the bias detection task via Low-Rank Adaptation (LoRA). In terms of data processing, we construct a more balanced training set to alleviate class imbalance and introduce heterogeneous samples from multiple sources to enhance model generalization. For the detection and classification sub-tasks, we employ a majority voting strategy that integrates outputs from multiple expert models to boost performance. Additionally, to improve bias generation detection and mitigation, we design a multi-temperature sampling mechanism to capture potential variations in bias expression styles. Experimental results demonstrate the effectiveness of our approach in bias detection, classification, and mitigation. Our method ultimately achieves an average score of 47.90%, ranking fourth in the shared task.
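The three ingredients named in the abstract (LoRA adaptation, majority voting over expert models, and multi-temperature sampling) can be sketched as follows. This is a minimal illustration, not the authors' released code: the `classify`, `generate`, and `score` callables, the temperature grid, and the LoRA hyperparameters are all assumptions.

```python
from collections import Counter

# LoRA adaptation (sketch; hyperparameters are assumptions):
#   from peft import LoraConfig, get_peft_model
#   config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
#                       target_modules=["q_proj", "v_proj"],
#                       task_type="CAUSAL_LM")
#   model = get_peft_model(base_model, config)

def majority_vote(models, sentence, classify):
    """Detection/classification: aggregate the labels predicted by
    several fine-tuned expert models and keep the most common one.
    `classify(model, sentence) -> label` is an assumed helper."""
    votes = [classify(m, sentence) for m in models]
    label, _ = Counter(votes).most_common(1)[0]
    return label

def multi_temperature_mitigate(generate, score, sentence,
                               temperatures=(0.2, 0.7, 1.0)):
    """Mitigation: sample debiased rewrites at several temperatures to
    cover different bias-expression styles, then keep the best-scoring
    candidate. `generate` and `score` are assumed callables."""
    candidates = [generate(sentence, t) for t in temperatures]
    return max(candidates, key=score)
```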
Related papers
- Detection, Classification, and Mitigation of Gender Bias in Large Language Models [6.762310697831219]
We investigate how to enhance the capabilities of large language models (LLMs) in gender bias detection, classification, and mitigation. We adopt reinforcement learning, chain-of-thought reasoning, and supervised fine-tuning to handle the different subtasks. Our approach ranked first across all three subtasks of the NLPCC 2025 Shared Task 7.
arXiv Detail & Related papers (2025-06-14T14:53:25Z) - On the Interconnections of Calibration, Quantification, and Classifier Accuracy Prediction under Dataset Shift [58.91436551466064]
This paper investigates the interconnections among three fundamental problems (calibration, quantification, and classifier accuracy prediction) under dataset shift conditions. We show that access to an oracle for any one of these tasks enables the resolution of the other two. We propose new methods for each problem based on direct adaptations of well-established methods borrowed from the other disciplines.
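The claimed interconnection has a concrete special case worth spelling out: if a binary classifier is well calibrated on the shifted sample, then averaging its probabilities yields both a quantifier (probabilistic classify-and-count) and an accuracy estimate. A minimal sketch under that calibration assumption:

```python
import numpy as np

def estimates_from_calibrated_probs(p_pos):
    """p_pos: calibrated P(y=1|x) for an unlabeled sample.

    Under the calibration assumption, the mean probability estimates the
    positive-class prevalence (quantification), and the mean confidence
    of the predicted class estimates accuracy (accuracy prediction)."""
    p_pos = np.asarray(p_pos, dtype=float)
    prevalence = p_pos.mean()
    accuracy = np.maximum(p_pos, 1.0 - p_pos).mean()
    return prevalence, accuracy

# Example: five unlabeled instances.
prev, acc = estimates_from_calibrated_probs([0.9, 0.2, 0.1, 0.35, 0.05])
```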
arXiv Detail & Related papers (2025-05-16T15:42:55Z) - Add-One-In: Incremental Sample Selection for Large Language Models via a Choice-Based Greedy Paradigm [50.492124556982674]
This paper introduces a novel choice-based sample selection framework. It shifts the focus from evaluating individual sample quality to comparing the contribution value of different samples. We validate our approach on a larger medical dataset, highlighting its practical applicability in real-world settings.
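A generic sketch of the choice-based idea: rather than scoring samples in isolation, each round adds the candidate whose inclusion most improves the selected set. The `utility` callable (e.g. validation performance with that set) is an assumed placeholder, not the paper's contribution measure.

```python
def greedy_add_one_in(candidates, budget, utility):
    """Incrementally build a subset by always adding the candidate whose
    inclusion yields the largest utility gain. `utility(subset)` is an
    assumed black-box scorer (e.g. downstream validation performance)."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < budget:
        best = max(remaining, key=lambda c: utility(selected + [c]))
        selected.append(best)
        remaining.remove(best)
    return selected
```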
arXiv Detail & Related papers (2025-03-04T07:32:41Z) - Exploring Imbalanced Annotations for Effective In-Context Learning [41.618125904839424]
Large language models (LLMs) have shown impressive performance on downstream tasks through in-context learning (ICL). In this work, we show that such class imbalances significantly degrade ICL performance across various tasks. We propose Conditional Reweighting with Conditional Bias to enhance ICL performance under class imbalance.
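One simple rebalancing baseline in the spirit of the summary above (the proposed method's exact name is garbled in the source, so this is only a hedged illustration) is to draw in-context demonstrations with near-equal per-class counts:

```python
import random
from collections import defaultdict

def balanced_demonstrations(pool, k, seed=0):
    """Sample k in-context demonstrations with (near-)equal per-class
    counts from an imbalanced annotated pool of (text, label) pairs.
    A generic rebalancing baseline, not the paper's method."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in pool:
        by_label[label].append((text, label))
    per_class = max(1, k // len(by_label))
    demos = []
    for label, items in by_label.items():
        demos.extend(rng.sample(items, min(per_class, len(items))))
    rng.shuffle(demos)
    return demos[:k]
```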
arXiv Detail & Related papers (2025-02-06T12:57:50Z) - STOP! Benchmarking Large Language Models with Sensitivity Testing on Offensive Progressions [6.19084217044276]
Mitigating explicit and implicit biases in Large Language Models (LLMs) has become a critical focus in the field of natural language processing. We introduce the Sensitivity Testing on Offensive Progressions (STOP) dataset, which includes 450 offensive progressions containing 2,700 unique sentences. Our findings reveal that even the best-performing models detect bias inconsistently, with success rates ranging from 19.3% to 69.8%.
arXiv Detail & Related papers (2024-09-20T18:34:38Z) - Debiasing Multimodal Large Language Models via Penalization of Language Priors [38.97645845493758]
Multimodal Large Language Models (MLLMs) have become indispensable tools in computer vision and natural language processing. Despite their advancements, our investigation reveals a noteworthy bias: the generated content is often driven more by the inherent priors of the underlying Large Language Models (LLMs) than by the input image. We propose two simple, training-free strategies to rectify these biases and redirect the model's focus toward visual information.
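A common training-free recipe for penalizing language priors, in the same family as the strategies summarized above (the paper's exact formulas may differ), contrasts image-conditioned and text-only next-token logits:

```python
import torch

def prior_penalized_logits(logits_with_image: torch.Tensor,
                           logits_text_only: torch.Tensor,
                           alpha: float = 0.5) -> torch.Tensor:
    """Contrastive next-token scores: down-weight tokens the LLM would
    predict from text alone, boosting visually grounded tokens.
    `alpha` is an assumed penalty strength."""
    return (1 + alpha) * logits_with_image - alpha * logits_text_only

# Example with random logits over a toy vocabulary of size 10.
vocab = 10
adjusted = prior_penalized_logits(torch.randn(vocab), torch.randn(vocab))
next_token = int(adjusted.argmax())
```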
arXiv Detail & Related papers (2024-03-08T12:35:07Z) - Identifying and Adapting Transformer-Components Responsible for Gender Bias in an English Language Model [1.6343144783668118]
Language models (LMs) exhibit and amplify many types of undesirable biases learned from the training data, including gender bias.
We study three methods for identifying causal relations between LM components and particular model outputs.
We apply the methods to GPT-2 small and the problem of gender bias, and use the discovered sets of components to perform parameter-efficient fine-tuning for bias mitigation.
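The final step, fine-tuning only the identified components, amounts to freezing every other parameter. A sketch using Hugging Face's GPT-2 small layout; the particular components listed are an illustrative assumption, not the paper's discovered set:

```python
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")

# Hypothetical result of the causal analysis: e.g. the attention blocks
# of layers 4 and 5 were implicated in gender-biased predictions.
responsible = ("transformer.h.4.attn", "transformer.h.5.attn")

for name, param in model.named_parameters():
    # Train only parameters belonging to the identified components.
    param.requires_grad = name.startswith(responsible)
```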
arXiv Detail & Related papers (2023-10-19T09:39:21Z) - D-CALM: A Dynamic Clustering-based Active Learning Approach for Mitigating Bias [13.008323851750442]
In this paper, we propose a novel adaptive clustering-based active learning algorithm, D-CALM, that dynamically adjusts clustering and annotation efforts.
Experiments on eight datasets for a diverse set of text classification tasks, including emotion, hate speech, dialog act, and book type detection, demonstrate that our proposed algorithm significantly outperforms baseline AL approaches.
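A rough sketch of a clustering-based active-learning round in the spirit of D-CALM: cluster the unlabeled pool, then spend more of the annotation budget in clusters where the current classifier is least confident. The allocation rule below is an assumption, not D-CALM's exact adjustment strategy.

```python
import numpy as np
from sklearn.cluster import KMeans

def select_batch(embeddings, proba, budget, n_clusters=8, seed=0):
    """embeddings: (N, d) features of unlabeled texts.
    proba: (N, C) current model probabilities for the same texts.
    Returns indices to annotate, weighted toward uncertain clusters."""
    labels = KMeans(n_clusters=n_clusters, random_state=seed,
                    n_init=10).fit_predict(embeddings)
    uncertainty = 1.0 - proba.max(axis=1)           # per-example
    total = uncertainty.sum() + 1e-12
    picks = []
    for c in range(n_clusters):
        idx = np.where(labels == c)[0]
        if idx.size == 0:
            continue
        share = uncertainty[idx].sum() / total
        k = min(idx.size, max(1, int(round(share * budget))))
        picks.extend(idx[np.argsort(-uncertainty[idx])[:k]])
    return picks[:budget]
```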
arXiv Detail & Related papers (2023-05-26T15:17:43Z) - Delving into Identify-Emphasize Paradigm for Combating Unknown Bias [52.76758938921129]
We propose an effective bias-conflicting scoring method (ECS) to boost the identification accuracy.
We also propose gradient alignment (GA) to balance the contributions of the mined bias-aligned and bias-conflicting samples.
Experiments are conducted on multiple datasets in various settings, demonstrating that the proposed solution can mitigate the impact of unknown biases.
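The balancing idea can be illustrated with a group-averaged loss in which the mined bias-conflicting minority is not drowned out by the bias-aligned majority; the simple per-group averaging below is a stand-in for GA's actual gradient-based criterion.

```python
import torch

def group_balanced_loss(losses: torch.Tensor,
                        conflicting_mask: torch.Tensor) -> torch.Tensor:
    """losses: per-sample losses for a batch.
    conflicting_mask: True for samples scored as bias-conflicting
    (e.g. by an ECS-style scoring method). Each group is averaged
    separately so the rare bias-conflicting samples contribute on
    equal footing with the bias-aligned majority."""
    aligned = losses[~conflicting_mask]
    conflicting = losses[conflicting_mask]
    terms = [g.mean() for g in (aligned, conflicting) if g.numel() > 0]
    return torch.stack(terms).mean()

# Example: 6 bias-aligned samples, 2 bias-conflicting ones.
losses = torch.rand(8)
mask = torch.tensor([False] * 6 + [True] * 2)
loss = group_balanced_loss(losses, mask)
```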
arXiv Detail & Related papers (2023-02-22T14:50:24Z) - Counter-GAP: Counterfactual Bias Evaluation through Gendered Ambiguous Pronouns [53.62845317039185]
Bias-measuring datasets play a critical role in detecting biased behavior of language models.
We propose a novel method to collect diverse, natural, and minimally distant text pairs via counterfactual generation.
We show that four pre-trained language models are significantly more inconsistent across different gender groups than within each group.
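The inconsistency finding can be operationalized by flipping gendered pronouns and counting prediction changes. The naive token swap and the `predict` callable below are illustrative assumptions; Counter-GAP constructs its pairs by counterfactual generation rather than simple substitution.

```python
# Naive pronoun swap for illustration only; real counterfactuals need
# morphological and coreference care, which Counter-GAP addresses.
SWAP = {"he": "she", "she": "he", "his": "her", "her": "his",
        "him": "her"}

def pronoun_swap(sentence: str) -> str:
    return " ".join(SWAP.get(tok.lower(), tok) for tok in sentence.split())

def inconsistency_rate(sentences, predict):
    """Fraction of sentences whose predicted label changes under the
    gender swap. `predict(text) -> label` is an assumed model wrapper."""
    flips = sum(predict(s) != predict(pronoun_swap(s)) for s in sentences)
    return flips / max(1, len(sentences))
```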
arXiv Detail & Related papers (2023-02-11T12:11:03Z) - General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model, analogous to gradient descent in functional space.
GGD can learn a more robust base model under both settings: task-specific biased models with prior knowledge, and a self-ensemble biased model without prior knowledge.
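One familiar concrete instance of training a base model against already-fitted biased models is a product-of-experts objective, where frozen biased-model log-probabilities are added to the base model's logits so the base model is pushed to explain what the biased models cannot. GGD's functional-gradient view is more general; this sketch only conveys the flavor.

```python
import torch
import torch.nn.functional as F

def poe_debias_loss(base_logits: torch.Tensor,
                    biased_logits: torch.Tensor,
                    targets: torch.Tensor) -> torch.Tensor:
    """Product-of-experts debiasing: combine frozen biased-model logits
    with the base model's logits; gradients flow only through the base
    model, which must account for what the biased model misses."""
    combined = base_logits + biased_logits.detach().log_softmax(dim=-1)
    return F.cross_entropy(combined, targets)

# Example shapes: batch of 4, 3 classes.
loss = poe_debias_loss(torch.randn(4, 3), torch.randn(4, 3),
                       torch.tensor([0, 2, 1, 0]))
```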
arXiv Detail & Related papers (2021-12-20T14:47:32Z) - Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns about whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing ranking fairness and algorithm utility in the bipartite ranking scenario.
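The post-processing idea can be sketched as a per-group score shift that removes systematic gaps between protected groups before ranking; this crude mean-matching rule is a stand-in for the paper's utility/fairness trade-off machinery.

```python
import numpy as np

def group_shifted_scores(scores, groups):
    """Post-hoc adjustment: shift each group's scores so no group is
    systematically ranked higher on average. `groups` holds a group id
    per item; both arguments are illustrative inputs."""
    scores = np.asarray(scores, dtype=float)
    adjusted = scores.copy()
    for g in np.unique(groups):
        mask = np.asarray(groups) == g
        adjusted[mask] -= scores[mask].mean() - scores.mean()
    return adjusted

# Example: rank four items from two protected groups.
ranked = np.argsort(-group_shifted_scores([0.9, 0.4, 0.8, 0.3],
                                          ["a", "a", "b", "b"]))
```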
arXiv Detail & Related papers (2020-06-15T10:08:39Z)