Investigating Label Bias and Representational Sources of Age-Related Disparities in Medical Segmentation
- URL: http://arxiv.org/abs/2511.00477v1
- Date: Sat, 01 Nov 2025 10:06:30 GMT
- Title: Investigating Label Bias and Representational Sources of Age-Related Disparities in Medical Segmentation
- Authors: Aditya Parikh, Sneha Das, Aasa Feragen
- Abstract summary: Algorithmic bias in medical imaging can perpetuate health disparities. In breast cancer segmentation, models exhibit significant performance disparities against younger patients. This work introduces a systematic framework for diagnosing algorithmic bias in medical segmentation.
- Score: 8.774604259603304
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Algorithmic bias in medical imaging can perpetuate health disparities, yet its causes remain poorly understood in segmentation tasks. While fairness has been extensively studied in classification, segmentation remains underexplored despite its clinical importance. In breast cancer segmentation, models exhibit significant performance disparities against younger patients, commonly attributed to physiological differences in breast density. We audit the MAMA-MIA dataset, establishing a quantitative baseline of age-related bias in its automated labels, and reveal a critical Biased Ruler effect where systematically flawed labels for validation misrepresent a model's actual bias. However, whether this bias originates from lower-quality annotations (label bias) or from fundamentally more challenging image characteristics remains unclear. Through controlled experiments, we systematically refute hypotheses that the bias stems from label quality sensitivity or quantitative case difficulty imbalance. Balancing training data by difficulty fails to mitigate the disparity, revealing that younger patient cases are intrinsically harder to learn. We provide direct evidence that systemic bias is learned and amplified when training on biased, machine-generated labels, a critical finding for automated annotation pipelines. This work introduces a systematic framework for diagnosing algorithmic bias in medical segmentation and demonstrates that achieving fairness requires addressing qualitative distributional differences rather than merely balancing case counts.
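The audit described in the abstract rests on comparing segmentation quality across age groups. A minimal sketch of such a subgroup Dice audit is shown below; the scores, ages, and the 50-year threshold are synthetic placeholders, not the paper's actual MAMA-MIA data or protocol:

```python
import numpy as np

def dice(pred, gt, eps=1e-7):
    """Dice coefficient between two binary segmentation masks."""
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

rng = np.random.default_rng(0)

# Synthetic cohort: one Dice score and one patient age per case.
ages = rng.integers(25, 80, size=200)
scores = rng.uniform(0.6, 0.95, size=200)

# Split into age groups and measure the performance gap
# (the "disparity" a fairness audit would report).
young = scores[ages < 50].mean()
older = scores[ages >= 50].mean()
gap = older - young
print(f"mean Dice <50: {young:.3f}, >=50: {older:.3f}, gap: {gap:.3f}")
```

Note that the paper's Biased Ruler effect applies to exactly this kind of audit: if the ground-truth masks fed to `dice` are themselves machine-generated and systematically worse for one group, the measured gap misrepresents the model's true bias.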
Related papers
- Mitigating Individual Skin Tone Bias in Skin Lesion Classification through Distribution-Aware Reweighting [0.4784604186682396]
This study introduces a distribution-based framework for evaluating and mitigating individual fairness in skin lesion classification. We treat skin tone as a continuous attribute rather than a categorical label, and employ kernel density estimation (KDE) to model its distribution.
arXiv Detail & Related papers (2025-12-09T15:45:20Z)
- Selective Mixup for Debiasing Question Selection in Computerized Adaptive Testing [50.805231979748434]
Computerized Adaptive Testing (CAT) is a widely used technology for evaluating learners' proficiency in online education platforms. Selection Bias arises because the question selection is strongly influenced by the estimated proficiency. We propose a debiasing framework consisting of two key modules: Cross-Attribute Examinee Retrieval and Selective Mixup-based Regularization.
arXiv Detail & Related papers (2025-11-19T08:55:01Z)
- Who Does Your Algorithm Fail? Investigating Age and Ethnic Bias in the MAMA-MIA Dataset [8.774604259603304]
We audit the fairness of the automated segmentation labels provided in the breast cancer tumor segmentation dataset MAMA-MIA. Our analysis reveals an intrinsic age-related bias against younger patients that persists even after controlling for confounding factors, such as data source.
arXiv Detail & Related papers (2025-10-31T12:20:31Z)
- FairREAD: Re-fusing Demographic Attributes after Disentanglement for Fair Medical Image Classification [3.615240611746158]
We propose Fair Re-fusion After Disentanglement (FairREAD), a framework that mitigates unfairness by re-integrating sensitive demographic attributes into fair image representations. FairREAD employs adversarial training to disentangle demographic information while using a controlled re-fusion mechanism to preserve clinically relevant details. Comprehensive evaluations on a large-scale clinical X-ray dataset demonstrate that FairREAD significantly reduces unfairness metrics while maintaining diagnostic accuracy.
arXiv Detail & Related papers (2024-12-20T22:17:57Z)
- Achieving Reliable and Fair Skin Lesion Diagnosis via Unsupervised Domain Adaptation [43.1078084014722]
Unsupervised domain adaptation (UDA) can integrate large external datasets for developing reliable classifiers.
UDA can effectively mitigate bias against minority groups and enhance fairness in diagnostic systems.
arXiv Detail & Related papers (2023-07-06T17:32:38Z)
- M$^3$Fair: Mitigating Bias in Healthcare Data through Multi-Level and Multi-Sensitive-Attribute Reweighting Method [13.253174531040106]
We propose M$^3$Fair, a multi-level and multi-sensitive-attribute reweighting method that extends the reweighting (RW) method to multiple sensitive attributes at multiple levels.
Our experiments on real-world datasets show that the approach is effective, straightforward, and generalizable in addressing healthcare fairness issues.
arXiv Detail & Related papers (2023-06-07T03:20:44Z)
- Towards unraveling calibration biases in medical image analysis [2.4054878434935074]
We show how several typically employed calibration metrics are systematically biased with respect to sample sizes.
This is of particular relevance to fairness studies, where data imbalance results in drastic sample size differences between demographic sub-groups.
arXiv Detail & Related papers (2023-05-09T00:11:35Z)
- Self-supervised debiasing using low rank regularization [59.84695042540525]
Spurious correlations can cause strong biases in deep neural networks, impairing generalization ability.
We propose a self-supervised debiasing framework potentially compatible with unlabeled samples.
Remarkably, the proposed debiasing framework significantly improves the generalization performance of self-supervised learning baselines.
arXiv Detail & Related papers (2022-10-11T08:26:19Z)
- Pseudo Bias-Balanced Learning for Debiased Chest X-ray Classification [57.53567756716656]
We study the problem of developing debiased chest X-ray diagnosis models without exact knowledge of the bias labels.
We propose a novel algorithm, pseudo bias-balanced learning, which first captures and predicts per-sample bias labels.
Our proposed method achieved consistent improvements over other state-of-the-art approaches.
arXiv Detail & Related papers (2022-03-18T11:02:18Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
- Estimating and Improving Fairness with Adversarial Learning [65.99330614802388]
We propose an adversarial multi-task training strategy to simultaneously mitigate and detect bias in the deep learning-based medical image analysis system.
Specifically, we propose to add a discrimination module against bias and a critical module that predicts unfairness within the base classification model.
We evaluate our framework on a large-scale, publicly available skin lesion dataset.
arXiv Detail & Related papers (2021-03-07T03:10:32Z)
- Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)
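The consistency idea behind the last entry above can be sketched in a few lines: penalize disagreement between a model's predictions on an unlabeled input and on a perturbed copy of it. The toy linear "model", softmax head, and noise scale below are illustrative stand-ins, not the paper's deep network or training scheme:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(8, 3))  # stand-in classifier weights: 8 features -> 3 classes

def softmax(z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def predict(x):
    return softmax(x @ W)

x = rng.normal(size=(16, 8))                   # unlabeled batch
x_pert = x + 0.05 * rng.normal(size=x.shape)   # perturbed view of the same batch

# Mean-squared consistency penalty between the two prediction sets;
# minimizing this (alongside a supervised loss) encourages the
# prediction consistency the entry describes.
cons_loss = np.mean((predict(x) - predict(x_pert)) ** 2)
print(f"consistency loss: {cons_loss:.4f}")
```

No labels are needed to compute `cons_loss`, which is what lets such methods exploit unlabeled data.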