Bias and Generalizability of Foundation Models across Datasets in Breast Mammography
- URL: http://arxiv.org/abs/2505.10579v2
- Date: Mon, 19 May 2025 07:22:12 GMT
- Title: Bias and Generalizability of Foundation Models across Datasets in Breast Mammography
- Authors: Elodie Germani, Ilayda Selin Türk, Fatima Zeineddine, Charbel Mourad, Shadi Albarqouni
- Abstract summary: We explore the fairness and bias of foundation models (FMs) for breast mammography classification. We leverage a large pool of datasets from diverse sources, including data from underrepresented regions and an in-house dataset. Our experiments show that while modality-specific pre-training of FMs enhances performance, classifiers trained on features from individual datasets fail to generalize across domains.
- Score: 4.117899774444893
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Over the past decades, computer-aided diagnosis tools for breast cancer have been developed to enhance screening procedures, yet their clinical adoption remains challenged by data variability and inherent biases. Although foundation models (FMs) have recently demonstrated impressive generalizability and transfer learning capabilities by leveraging vast and diverse datasets, their performance can be undermined by spurious correlations that arise from variations in image quality, labeling uncertainty, and sensitive patient attributes. In this work, we explore the fairness and bias of FMs for breast mammography classification by leveraging a large pool of datasets from diverse sources, including data from underrepresented regions and an in-house dataset. Our extensive experiments show that while modality-specific pre-training of FMs enhances performance, classifiers trained on features from individual datasets fail to generalize across domains. Aggregating datasets improves overall performance, yet does not fully mitigate biases, leading to significant disparities across under-represented subgroups such as extreme breast densities and age groups. Furthermore, while domain-adaptation strategies can reduce these disparities, they often incur a performance trade-off. In contrast, fairness-aware techniques yield more stable and equitable performance across subgroups. These findings underscore the necessity of incorporating rigorous fairness evaluations and mitigation strategies into FM-based models to foster inclusive and generalizable AI.
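The subgroup disparities the abstract describes can be quantified by scoring a classifier separately on each subgroup and reporting the worst-case gap. A minimal sketch follows; the function name `subgroup_auc_gap` and the toy `density` attribute are illustrative stand-ins, not the paper's actual evaluation code:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def subgroup_auc_gap(y_true, y_score, groups):
    """Per-subgroup AUC and the worst-case gap (max - min).

    y_true: binary labels; y_score: classifier scores;
    groups: one subgroup label per sample (e.g. breast density a-d).
    """
    aucs = {}
    for g in np.unique(groups):
        mask = groups == g
        # AUC is undefined if a subgroup contains only one class.
        if len(np.unique(y_true[mask])) == 2:
            aucs[g] = roc_auc_score(y_true[mask], y_score[mask])
    gap = max(aucs.values()) - min(aucs.values())
    return aucs, gap

# Toy data with a hypothetical "density" attribute.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200)
scores = y * 0.5 + rng.random(200) * 0.8
density = rng.choice(["a", "b", "c", "d"], 200)
aucs, gap = subgroup_auc_gap(y, scores, density)
```

A small gap indicates equitable performance; a large gap flags the kind of subgroup disparity the paper reports for extreme breast densities and age groups.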
Related papers
- Benchmarking Foundation Models for Mitotic Figure Classification [0.37334049820361814]
Self-supervised learning techniques have enabled the use of vast amounts of unlabeled data to train large-scale neural networks.<n>In this work, we investigate the use of foundation models for mitotic figure classification.<n>We compare all models against end-to-end-trained baselines, both CNNs and Vision Transformers.
arXiv Detail & Related papers (2025-08-06T13:30:40Z)
- Evaluating Facial Expression Recognition Datasets for Deep Learning: A Benchmark Study with Novel Similarity Metrics [4.137346786534721]
This study investigates the key characteristics and suitability of widely used Facial Expression Recognition (FER) datasets for training deep learning models. We compiled and analyzed 24 FER datasets, including those targeting specific age groups such as children, adults, and the elderly. Benchmark experiments using state-of-the-art neural networks reveal that large-scale, automatically collected datasets tend to generalize better.
arXiv Detail & Related papers (2025-03-26T11:01:00Z)
- Revisiting Automatic Data Curation for Vision Foundation Models in Digital Pathology [41.34847597178388]
Vision foundation models (FMs) learn to represent histological features in highly heterogeneous tiles extracted from whole-slide images. We investigate the potential of unsupervised automatic data curation at the tile-level, taking into account 350 million tiles.
arXiv Detail & Related papers (2025-03-24T14:23:48Z)
- Data-Driven Fairness Generalization for Deepfake Detection [1.2221087476416053]
Biases in the training data for deepfake detection can result in varying levels of performance across different demographic groups. We propose a data-driven framework for tackling the fairness generalization problem in deepfake detection by leveraging synthetic datasets and model optimization.
arXiv Detail & Related papers (2024-12-21T01:28:35Z)
- FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models [37.803490266325]
We introduce FairMedFM, a fairness benchmark for foundation models (FMs) research in medical imaging. FairMedFM integrates with 17 popular medical imaging datasets, encompassing different modalities, dimensionalities, and sensitive attributes. It explores 20 widely used FMs, with various usages such as zero-shot learning, linear probing, parameter-efficient fine-tuning, and prompting, across downstream classification and segmentation tasks.
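Linear probing, one of the FM usages evaluated in such benchmarks, trains only a lightweight classifier on top of frozen encoder features. A minimal sketch, with random vectors standing in for real foundation-model embeddings (the feature matrix, dimensions, and label rule are invented for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical frozen foundation-model features: in practice these come
# from an encoder's penultimate layer; random vectors stand in here.
rng = np.random.default_rng(42)
feats = rng.normal(size=(300, 128))           # (n_samples, embed_dim)
labels = (feats[:, 0] + rng.normal(scale=0.5, size=300) > 0).astype(int)

# Linear probe: only this classifier is trained; the backbone stays frozen.
probe = LogisticRegression(max_iter=1000).fit(feats[:200], labels[:200])
acc = probe.score(feats[200:], labels[200:])
```

Because the backbone never updates, probe accuracy directly measures how much task-relevant information the pre-trained features already encode.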
arXiv Detail & Related papers (2024-07-01T05:47:58Z)
- Counterfactual Fairness through Transforming Data Orthogonal to Bias [7.109458605736819]
We propose a novel data pre-processing algorithm, Orthogonal to Bias (OB).
OB is designed to eliminate the influence of a group of continuous sensitive variables, thus promoting counterfactual fairness in machine learning applications.
OB is model-agnostic, making it applicable to a wide range of machine learning models and tasks.
arXiv Detail & Related papers (2024-03-26T16:40:08Z)
- Consistency Regularization for Generalizable Source-free Domain Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods only assess their adapted models on the target training set, neglecting data from unseen but identically distributed testing sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
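A consistency regularizer of this flavor typically penalizes divergence between a model's predictions on two views of the same unlabeled target image. A minimal sketch; the KL form and the weak/strong-view naming are common conventions assumed here, not necessarily the paper's exact loss:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def consistency_loss(logits_weak, logits_strong):
    """Mean KL(p_weak || p_strong): penalizes prediction drift between
    two augmented views of the same unlabeled target image."""
    p = softmax(logits_weak)
    q = softmax(logits_strong)
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return float(np.mean(kl))

# Two views agreeing perfectly incur zero loss; disagreement is penalized.
logits = np.array([[2.0, 0.5, -1.0], [0.1, 1.2, 0.3]])
zero = consistency_loss(logits, logits)
drift = consistency_loss(logits, -logits)
```

Minimizing this term pushes the adapted model toward predictions that are stable under augmentation, a proxy for generalization beyond the target training set.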
arXiv Detail & Related papers (2023-08-03T07:45:53Z)
- Class-Balancing Diffusion Models [57.38599989220613]
Class-Balancing Diffusion Models (CBDM) are trained with a distribution adjustment regularizer to counter imbalanced class distributions.
We benchmark generation on the CIFAR100/CIFAR100LT datasets and show strong performance on the downstream recognition task.
arXiv Detail & Related papers (2023-04-30T20:00:14Z)
- Generative models improve fairness of medical classifiers under distribution shifts [49.10233060774818]
We show that learning realistic augmentations automatically from data is possible in a label-efficient manner using generative models.
We demonstrate that these learned augmentations can surpass heuristic ones, making models more robust and statistically fair in- and out-of-distribution.
arXiv Detail & Related papers (2023-04-18T18:15:38Z)
- Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
- Enhancing Facial Data Diversity with Style-based Face Aging [59.984134070735934]
Face datasets are typically biased in terms of attributes such as gender, age, and race.
We propose a novel, generative style-based architecture for data augmentation that captures fine-grained aging patterns.
We show that the proposed method outperforms state-of-the-art algorithms for age transfer.
arXiv Detail & Related papers (2020-06-06T21:53:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.