On the Impact of Random Seeds on the Fairness of Clinical Classifiers
- URL: http://arxiv.org/abs/2104.06338v1
- Date: Tue, 13 Apr 2021 16:30:39 GMT
- Title: On the Impact of Random Seeds on the Fairness of Clinical Classifiers
- Authors: Silvio Amir and Jan-Willem van de Meent and Byron C. Wallace
- Abstract summary: We explore the implications of this phenomenon for model fairness across demographic groups in clinical prediction tasks over electronic health records.
We also find that the small sample sizes inherent to looking at intersections of minority groups and somewhat rare conditions limit our ability to accurately estimate disparities.
- Score: 27.71610203951057
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recent work has shown that fine-tuning large networks is surprisingly
sensitive to changes in random seed(s). We explore the implications of this
phenomenon for model fairness across demographic groups in clinical prediction
tasks over electronic health records (EHR) in MIMIC-III -- the standard dataset
in clinical NLP research. Apparent subgroup performance varies substantially
for seeds that yield similar overall performance, although there is no evidence
of a trade-off between overall and subgroup performance. However, we also find
that the small sample sizes inherent to looking at intersections of minority
groups and somewhat rare conditions limit our ability to accurately estimate
disparities. Further, we find that jointly optimizing for high overall
performance and low disparities does not yield statistically significant
improvements. Our results suggest that fairness work using MIMIC-III should
carefully account for variations in apparent differences that may arise from
stochasticity and small sample sizes.
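As a rough illustration of the experiment the abstract describes, the sketch below retrains one model under several random seeds on a fixed split and compares the spread of overall AUC with the spread of the subgroup AUC gap. The synthetic data and the small scikit-learn MLP are stand-ins for the paper's EHR tasks, not its actual pipeline.

```python
# Minimal sketch: vary only the model's random seed, then compare the
# stability of overall AUC against the stability of the subgroup gap.
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n, d = 4000, 20
X = rng.normal(size=(n, d))
y = (X[:, 0] + 0.5 * rng.normal(size=n) > 0).astype(int)
group = rng.integers(0, 2, size=n)      # stand-in demographic attribute

X_tr, X_te, y_tr, y_te, _, g_te = train_test_split(
    X, y, group, test_size=0.3, random_state=0)

overall, gaps = [], []
for seed in range(20):                  # only the model's seed varies
    clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                        random_state=seed).fit(X_tr, y_tr)
    p = clf.predict_proba(X_te)[:, 1]
    overall.append(roc_auc_score(y_te, p))
    aucs = [roc_auc_score(y_te[g_te == g], p[g_te == g]) for g in (0, 1)]
    gaps.append(aucs[0] - aucs[1])

print(f"overall AUC : {np.mean(overall):.3f} +/- {np.std(overall):.3f}")
print(f"subgroup gap: {np.mean(gaps):+.3f} +/- {np.std(gaps):.3f}")
```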
Related papers
- How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance [64.1656365676171]
Group imbalance has been a known problem in empirical risk minimization.
This paper quantifies the impact of individual groups on the sample complexity, the convergence rate, and the average and group-level testing performance.
arXiv Detail & Related papers (2024-03-12T04:38:05Z)
- GroupMixNorm Layer for Learning Fair Models [4.324785083027206]
This research proposes a novel in-processing GroupMixNorm layer for mitigating bias in deep learning models.
The proposed method improves upon several fairness metrics with minimal impact on overall accuracy.
arXiv Detail & Related papers (2023-12-19T09:04:26Z)
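The summary above names the layer but not its formulation. Purely as an assumption, one plausible shape for a group-mixing normalization is to normalize activations with a random convex mixture of per-group batch statistics, so that downstream layers cannot rely on group-specific feature distributions; a minimal PyTorch sketch:

```python
import torch

def group_mix_norm(x, group, alpha=0.5, eps=1e-5):
    """x: (batch, features); group: (batch,) integer protected attribute.

    Hypothetical sketch: normalize with a random convex mixture of
    per-group batch statistics (an assumed reading of the abstract).
    """
    groups = group.unique()
    means = torch.stack([x[group == g].mean(dim=0) for g in groups])
    stds = torch.stack([x[group == g].std(dim=0) + eps for g in groups])
    # One random convex weight vector over groups, shared by the batch.
    w = torch.distributions.Dirichlet(
        torch.full((len(groups),), alpha)).sample()
    mix_mean = (w[:, None] * means).sum(dim=0)
    mix_std = (w[:, None] * stds).sum(dim=0)
    return (x - mix_mean) / mix_std

x = torch.randn(64, 16)            # a batch of intermediate features
g = torch.randint(0, 2, (64,))     # stand-in group labels
print(group_mix_norm(x, g).shape)  # torch.Size([64, 16])
```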
- Auditing ICU Readmission Rates in a Clinical Database: An Analysis of Risk Factors and Clinical Outcomes [0.0]
This study presents a machine learning pipeline for clinical data classification in the context of a 30-day readmission problem.
The fairness audit uncovers disparities in equal opportunity, predictive parity, false positive rate parity, and false negative rate parity criteria.
The study suggests the need for collaborative efforts among researchers, policymakers, and practitioners to address bias and fairness in artificial intelligence (AI) systems.
arXiv Detail & Related papers (2023-04-12T17:09:38Z)
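The four criteria named in the audit above reduce to comparing group-conditional rates (equal opportunity compares TPRs; predictive parity compares PPVs). A minimal sketch with illustrative names and random stand-in predictions:

```python
import numpy as np

def group_rates(y_true, y_pred, mask):
    y, p = y_true[mask], y_pred[mask]
    tpr = (p[y == 1] == 1).mean()   # equal opportunity compares TPRs
    fpr = (p[y == 0] == 1).mean()   # false positive rate parity
    fnr = 1.0 - tpr                 # false negative rate parity
    ppv = (y[p == 1] == 1).mean()   # predictive parity compares PPVs
    return {"TPR": tpr, "FPR": fpr, "FNR": fnr, "PPV": ppv}

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 1000)  # stand-in labels (e.g. readmission)
y_pred = rng.integers(0, 2, 1000)  # stand-in binary predictions
group = rng.integers(0, 2, 1000)   # stand-in protected attribute

r0 = group_rates(y_true, y_pred, group == 0)
r1 = group_rates(y_true, y_pred, group == 1)
for k in r0:
    print(f"{k} gap between groups: {abs(r0[k] - r1[k]):.3f}")  # 0 = parity
```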
- Rethinking Semi-Supervised Medical Image Segmentation: A Variance-Reduction Perspective [51.70661197256033]
We propose ARCO, a semi-supervised contrastive learning framework with stratified group theory for medical image segmentation.
We first propose building ARCO through the concept of variance-reduced estimation and show that certain variance-reduction techniques are particularly beneficial in pixel/voxel-level segmentation tasks.
We experimentally validate our approaches on eight benchmarks, i.e., five 2D/3D medical and three semantic segmentation datasets, with different label settings.
arXiv Detail & Related papers (2023-02-03T13:50:25Z)
- Systematic Evaluation of Predictive Fairness [60.0947291284978]
Mitigating bias in training on biased datasets is an important open problem.
We examine the performance of various debiasing methods across multiple tasks.
We find that data conditions have a strong influence on relative model performance.
arXiv Detail & Related papers (2022-10-17T05:40:13Z)
- RegMixup: Mixup as a Regularizer Can Surprisingly Improve Accuracy and Out-of-Distribution Robustness [94.69774317059122]
We show that the effectiveness of the well-celebrated Mixup can be further improved if, instead of serving as the sole learning objective, it is used as an additional regularizer alongside the standard cross-entropy loss.
This simple change not only yields much improved accuracy but also substantially improves the quality of Mixup's predictive uncertainty estimates.
arXiv Detail & Related papers (2022-06-29T09:44:33Z)
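The loss structure described above (standard cross-entropy plus a Mixup term acting as a regularizer) can be sketched directly; the regularizer weight eta and the Beta concentration a below are assumptions, not the paper's tuned values:

```python
import torch
import torch.nn.functional as F

def regmixup_loss(model, x, y, eta=1.0, a=10.0):
    """Cross-entropy on the clean batch plus a Mixup CE regularizer.

    eta and Beta(a, a) are assumed hyperparameters; a large
    concentration keeps the mixing weight lam near 0.5.
    """
    clean = F.cross_entropy(model(x), y)          # standard CE objective
    lam = torch.distributions.Beta(a, a).sample().item()
    perm = torch.randperm(x.size(0))
    logits_mix = model(lam * x + (1 - lam) * x[perm])
    mix = lam * F.cross_entropy(logits_mix, y) + \
          (1 - lam) * F.cross_entropy(logits_mix, y[perm])
    return clean + eta * mix                      # Mixup as a regularizer

model = torch.nn.Sequential(torch.nn.Linear(8, 16), torch.nn.ReLU(),
                            torch.nn.Linear(16, 3))
x, y = torch.randn(32, 8), torch.randint(0, 3, (32,))
print(regmixup_loss(model, x, y))
```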
- How does overparametrization affect performance on minority groups? [39.54853544590893]
In a setting in which the regression functions for the majority and minority groups differ, we show that overparameterization always improves minority-group performance.
arXiv Detail & Related papers (2022-06-07T18:00:52Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- Self-Diagnosing GAN: Diagnosing Underrepresented Samples in Generative Adversarial Networks [5.754152248672317]
We propose a method to diagnose and emphasize underrepresented samples during the training of a Generative Adversarial Network (GAN).
The method builds on the observation that underrepresented samples exhibit a high average discrepancy, or high variability in discrepancy, over the course of training.
Our experimental results demonstrate that the proposed method improves GAN performance on various datasets.
arXiv Detail & Related papers (2021-02-24T02:31:50Z)
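The diagnosis step described above can be sketched independently of the GAN itself: record a per-sample discrepancy score at each checkpoint, then flag samples whose mean or variance is high and emphasize them (here via sampling weights). The scores below are simulated, and the log-ratio interpretation in the comment is an assumption:

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_checkpoints = 1000, 30
# scores[t, i]: discrepancy of sample i at checkpoint t; in the paper's
# spirit this could be a discriminator log-ratio, but here it is simulated.
scores = rng.normal(size=(n_checkpoints, n_samples))
scores[:, :100] += 2.0  # pretend the first 100 samples are underrepresented

mean_d = scores.mean(axis=0)
var_d = scores.var(axis=0)
flagged = (mean_d > np.quantile(mean_d, 0.9)) | \
          (var_d > np.quantile(var_d, 0.9))

# Emphasize flagged samples, e.g. by sampling them more often later on.
weights = np.where(flagged, 2.0, 1.0)
probs = weights / weights.sum()
print(f"flagged {flagged.sum()} samples; max sampling prob = {probs.max():.5f}")
```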
- An Investigation of Why Overparameterization Exacerbates Spurious Correlations [98.3066727301239]
We identify two key properties of the training data that drive this behavior.
We show how the inductive bias of models towards "memorizing" fewer examples can cause overparameterization to hurt.
arXiv Detail & Related papers (2020-05-09T01:59:13Z)