Evaluating and Mitigating Bias in AI-Based Medical Text Generation
- URL: http://arxiv.org/abs/2504.17279v1
- Date: Thu, 24 Apr 2025 06:10:40 GMT
- Title: Evaluating and Mitigating Bias in AI-Based Medical Text Generation
- Authors: Xiuying Chen, Tairan Wang, Juexiao Zhou, Zirui Song, Xin Gao, Xiangliang Zhang
- Abstract summary: AI systems may reflect and amplify human bias, reducing the quality of their performance in historically under-served populations. In this study, we investigate the fairness problem in text generation within the medical field. We propose an algorithm that selectively optimizes underperforming groups to reduce bias.
- Score: 35.24191727599811
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Artificial intelligence (AI) systems, particularly those based on deep learning models, have increasingly achieved expert-level performance in medical applications. However, there is growing concern that such AI systems may reflect and amplify human bias, reducing the quality of their performance in historically under-served populations. The fairness issue has attracted considerable research interest in medical imaging classification, yet it remains understudied in the text generation domain. In this study, we investigate the fairness problem in text generation within the medical field and observe significant performance discrepancies across different races, sexes, and age groups (including intersectional groups), across various model scales, and under different evaluation metrics. To mitigate this fairness issue, we propose an algorithm that selectively optimizes underperforming groups to reduce bias. The selection rules take into account not only word-level accuracy but also pathology accuracy with respect to the target reference, while ensuring that the entire process remains fully differentiable for effective model training. Our evaluations across multiple backbones, datasets, and modalities demonstrate that the proposed algorithm enhances fairness in text generation without compromising overall performance. Specifically, disparities among groups across different metrics were reduced by more than 30% with our algorithm, while the relative change in text generation accuracy was typically within 2%. By reducing the bias produced by deep learning models, our approach can alleviate concerns about the fairness and reliability of diagnostic text generation in the medical domain. Our code is publicly available to facilitate further research at https://github.com/iriscxy/GenFair.
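As a rough illustration of the selective-optimization idea described in the abstract, the sketch below upweights whichever demographic groups currently have the highest loss, using a softmax so the selection stays fully differentiable. It is a minimal stand-in, not the authors' method: it uses only token-level cross-entropy as the performance signal, whereas the paper's selection rules also incorporate pathology accuracy. All names here are hypothetical; the authoritative implementation is in the linked GenFair repository.

```python
import torch
import torch.nn.functional as F

def group_selective_loss(token_logits, target_ids, group_ids, num_groups,
                         temperature=0.5):
    """Upweight underperforming groups while staying fully differentiable.

    token_logits: (batch, seq_len, vocab) decoder outputs
    target_ids:   (batch, seq_len) reference report token ids
    group_ids:    (batch,) demographic group index per sample
    """
    # Per-sample token-level cross-entropy: a stand-in for the paper's
    # word-level signal (padding handling omitted for brevity).
    per_token = F.cross_entropy(token_logits.transpose(1, 2), target_ids,
                                reduction="none")           # (batch, seq_len)
    per_sample = per_token.mean(dim=1)                      # (batch,)

    # Mean loss per group via a one-hot matmul, keeping the pipeline
    # differentiable end to end.
    onehot = F.one_hot(group_ids, num_groups).float()       # (batch, groups)
    counts = onehot.sum(dim=0).clamp(min=1.0)
    group_loss = (onehot.t() @ per_sample) / counts         # (groups,)

    # Soft selection: the softmax shifts weight toward the groups the
    # model currently serves worst.
    group_weight = F.softmax(group_loss / temperature, dim=0)
    sample_weight = group_weight[group_ids]
    return (sample_weight * per_sample).sum() / sample_weight.sum()
```

Because the group weights are themselves a function of the per-sample losses, gradients flow through both the selection and the weighted objective, which is the differentiability property the abstract emphasizes.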
Related papers
- Group-Adaptive Threshold Optimization for Robust AI-Generated Text Detection [60.09665704993751]
We introduce FairOPT, an algorithm for group-specific threshold optimization in AI-generated content classifiers. Our approach partitions data into subgroups based on attributes (e.g., text length and writing style) and learns decision thresholds for each group. Our framework paves the way for more robust and fair classification criteria in AI-generated output detection.
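A minimal sketch of the group-adaptive thresholding idea, assuming each document receives a scalar detector score: pick a separate cutoff per subgroup rather than one global threshold. The objective below (per-group balanced accuracy over a small grid) is a hypothetical stand-in for FairOPT's actual optimization, which jointly balances performance and fairness criteria.

```python
import numpy as np

def fit_group_thresholds(scores, labels, groups,
                         grid=np.linspace(0.05, 0.95, 19)):
    """Learn one decision threshold per subgroup (e.g., text-length bucket).

    scores: detector scores in [0, 1]; labels: 1 = AI-generated, 0 = human;
    groups: subgroup id per document. Stand-in objective: balanced accuracy.
    """
    thresholds = {}
    for g in np.unique(groups):
        s, y = scores[groups == g], labels[groups == g]

        def balanced_acc(t):
            pred = s >= t
            tpr = pred[y == 1].mean() if (y == 1).any() else 0.0
            tnr = (~pred)[y == 0].mean() if (y == 0).any() else 0.0
            return 0.5 * (tpr + tnr)

        thresholds[g] = float(max(grid, key=balanced_acc))
    return thresholds
```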
arXiv Detail & Related papers (2025-02-06T21:58:48Z)
- A Data-Centric Approach to Detecting and Mitigating Demographic Bias in Pediatric Mental Health Text: A Case Study in Anxiety Detection [3.874958704454859]
We developed a data-centric de-biasing framework to address gender-based content disparities within clinical text. Our approach demonstrates an effective strategy for mitigating bias in AI healthcare models trained on text.
arXiv Detail & Related papers (2024-12-30T20:00:22Z)
- Biasing & Debiasing based Approach Towards Fair Knowledge Transfer for Equitable Skin Analysis [16.638722872021095]
We propose an approach based on two biased teachers to transfer fair knowledge into the student network.
Our approach mitigates biases present in the student network without harming its predictive accuracy.
arXiv Detail & Related papers (2024-05-16T17:02:23Z)
- A survey of recent methods for addressing AI fairness and bias in biomedicine [48.46929081146017]
Artificial intelligence systems may perpetuate social inequities or demonstrate biases, such as those based on race or gender.
We surveyed recent publications on debiasing methods in the fields of biomedical natural language processing (NLP) and computer vision (CV).
We performed a literature search on PubMed, ACM digital library, and IEEE Xplore of relevant articles published between January 2018 and December 2023 using multiple combinations of keywords.
We reviewed other potential methods from the general domain that could be applied to biomedicine to address bias and improve fairness.
arXiv Detail & Related papers (2024-02-13T06:38:46Z)
- The Limits of Fair Medical Imaging AI In The Wild [43.97266228706059]
We investigate the extent to which medical AI utilizes demographic encodings.
We confirm that medical imaging AI leverages demographic shortcuts in disease classification.
We find that models with less encoding of demographic attributes are often the most "globally optimal".
arXiv Detail & Related papers (2023-12-11T18:59:50Z)
- Fairness meets Cross-Domain Learning: a new perspective on Models and Metrics [80.07271410743806]
We study the relationship between cross-domain learning (CD) and model fairness.
We introduce a benchmark on face and medical images spanning several demographic groups as well as classification and localization tasks.
Our study covers 14 CD approaches alongside three state-of-the-art fairness algorithms and shows how the former can outperform the latter.
arXiv Detail & Related papers (2023-03-25T09:34:05Z)
- MEDFAIR: Benchmarking Fairness for Medical Imaging [44.73351338165214]
MEDFAIR is a framework to benchmark the fairness of machine learning models for medical imaging.
We find that the under-studied issue of model selection criterion can have a significant impact on fairness outcomes.
We make recommendations for different medical application scenarios that require different ethical principles.
arXiv Detail & Related papers (2022-10-04T16:30:47Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- Improving Fairness of AI Systems with Lossless De-biasing [15.039284892391565]
Mitigating bias in AI systems to increase overall fairness has emerged as an important challenge.
We present an information-lossless de-biasing technique that targets the scarcity of data in the disadvantaged group.
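Without access to the paper's details, here is a generic sketch of what "information-lossless" rebalancing can look like: the disadvantaged group is oversampled up to parity instead of discarding majority-group data, so no original samples are lost. Plain duplication is almost certainly cruder than the paper's actual technique; the names and signature below are hypothetical.

```python
import numpy as np

def lossless_rebalance(X, y, groups, disadvantaged, seed=0):
    """Oversample the disadvantaged group to match the rest of the data.

    Nothing is discarded (unlike undersampling), hence "lossless".
    Generic stand-in, not the paper's exact method.
    """
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(groups == disadvantaged)
    rest = np.flatnonzero(groups != disadvantaged)
    n_extra = max(0, rest.size - minority.size)
    extra = rng.choice(minority, size=n_extra, replace=True)
    keep = np.concatenate([np.arange(len(y)), extra])
    return X[keep], y[keep], groups[keep]
```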
arXiv Detail & Related papers (2021-05-10T17:38:38Z)
- Estimating and Improving Fairness with Adversarial Learning [65.99330614802388]
We propose an adversarial multi-task training strategy to simultaneously mitigate and detect bias in the deep learning-based medical image analysis system.
Specifically, we propose to add a discrimination module against bias and a critical module that predicts unfairness within the base classification model.
We evaluate our framework on a large-scale publicly available skin lesion dataset.
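A generic sketch of adversarial debiasing in this spirit, using a gradient-reversal layer (Ganin & Lempitsky style) as the "discrimination module": an auxiliary head tries to predict the sensitive attribute from the shared features, and the reversed gradient pushes the encoder toward features from which that attribute cannot be recovered. The paper's specific discrimination and critical modules may differ; this class and its names are hypothetical.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips and scales the gradient."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None

class FairLesionClassifier(nn.Module):
    """Skin-lesion classifier with an adversarial bias-prediction head."""
    def __init__(self, encoder, feat_dim, num_classes, num_groups, lam=1.0):
        super().__init__()
        self.encoder, self.lam = encoder, lam
        self.cls_head = nn.Linear(feat_dim, num_classes)  # disease prediction
        self.adv_head = nn.Linear(feat_dim, num_groups)   # sensitive attribute

    def forward(self, x):
        feats = self.encoder(x)
        # Training both heads with ordinary cross-entropy suffices: the
        # reversal layer makes the encoder oppose the adversary's objective.
        return (self.cls_head(feats),
                self.adv_head(GradReverse.apply(feats, self.lam)))
```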
arXiv Detail & Related papers (2021-03-07T03:10:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.