Fairness in Survival Analysis with Distributionally Robust Optimization
- URL: http://arxiv.org/abs/2409.10538v1
- Date: Sat, 31 Aug 2024 15:03:20 GMT
- Title: Fairness in Survival Analysis with Distributionally Robust Optimization
- Authors: Shu Hu, George H. Chen
- Abstract summary: We propose a general approach for encouraging fairness in survival analysis models based on minimizing a worst-case error across all subpopulations.
This approach can be used to convert many existing survival analysis models into ones that simultaneously encourage fairness.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a general approach for encouraging fairness in survival analysis models based on minimizing a worst-case error across all subpopulations that occur with at least a user-specified probability. This approach can be used to convert many existing survival analysis models into ones that simultaneously encourage fairness, without requiring the user to specify which attributes or features to treat as sensitive in the training loss function. From a technical standpoint, our approach applies recent developments of distributionally robust optimization (DRO) to survival analysis. The complication is that existing DRO theory uses a training loss function that decomposes across contributions of individual data points, i.e., any term that shows up in the loss function depends only on a single training point. This decomposition does not hold for commonly used survival loss functions, including for the Cox proportional hazards model, its deep neural network variants, and many other recently developed models that use loss functions involving ranking or similarity score calculations. We address this technical hurdle using a sample splitting strategy. We demonstrate our sample splitting DRO approach by using it to create fair versions of a diverse set of existing survival analysis models including the Cox model (and its deep variant DeepSurv), the discrete-time model DeepHit, and the neural ODE model SODEN. We also establish a finite-sample theoretical guarantee to show what our sample splitting DRO loss converges to. For the Cox model, we further derive an exact DRO approach that does not use sample splitting. For all the models that we convert into DRO variants, we show that the DRO variants often score better on recently established fairness metrics (without incurring a significant drop in accuracy) compared to existing survival analysis fairness regularization techniques.
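The worst-case-over-subpopulations objective described above corresponds to a CVaR-style loss, and the sample splitting idea can be illustrated with a small sketch: split the batch into disjoint subsamples, evaluate the non-decomposable survival loss once per subsample, and average the worst subsample losses. The code below is an illustrative reconstruction under these assumptions, not the authors' implementation; the function names and the plain-NumPy Cox loss are chosen for exposition only.

```python
import numpy as np

def cox_loss(risk, time, event):
    # Negative Cox partial log-likelihood. The loss is non-decomposable:
    # each event's term depends on the risk set of every individual
    # still at risk at that event time.
    order = np.argsort(-time)                     # sort by descending time
    risk, event = risk[order], event[order]
    log_risk_set = np.logaddexp.accumulate(risk)  # log sum_{j in risk set} exp(risk_j)
    return -np.sum((risk - log_risk_set) * event) / max(event.sum(), 1)

def sample_splitting_dro_loss(risk, time, event, n_splits=5, alpha=0.5):
    # Split the batch into disjoint subsamples, evaluate the
    # non-decomposable loss once per split, then average the worst
    # ceil(alpha * n_splits) split losses: a CVaR at level alpha over
    # splits, i.e., the worst-case subpopulations occurring with
    # probability at least alpha.
    idx = np.array_split(np.random.permutation(len(time)), n_splits)
    losses = np.array([cox_loss(risk[i], time[i], event[i]) for i in idx])
    k = max(1, int(np.ceil(alpha * n_splits)))
    return np.sort(losses)[-k:].mean()
```

Smaller `alpha` focuses the loss on fewer, worse-off splits (more conservative); `alpha = 1` recovers the ordinary average over splits.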
Related papers
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z) - Variational Deep Survival Machines: Survival Regression with Censored Outcomes [11.82370259688716]
Survival regression aims to predict the time when an event of interest will take place, typically a death or a failure.
We present a novel method to predict survival time by better clustering the survival data and combining primitive distributions.
arXiv Detail & Related papers (2024-04-24T02:16:00Z) - Distributionally Robust Multiclass Classification and Applications in Deep Image Classifiers [9.979945269265627]
We develop a Distributionally Robust Optimization (DRO) formulation for Multiclass Logistic Regression (MLR).
We demonstrate reductions in test error rate by up to 83.5% and loss by up to 91.3% compared with baseline methods, by adopting a novel random training method.
arXiv Detail & Related papers (2022-10-15T05:09:28Z) - Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples whose confidence exceeds the threshold.
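The ATC idea above can be sketched in a few lines: calibrate a confidence threshold on labeled source data so that the fraction of source points above it matches source accuracy, then report the fraction of unlabeled target points above that threshold. This is a minimal sketch under stated assumptions; the function names are hypothetical, and negative entropy is one of the confidence scores considered (max softmax probability also works).

```python
import numpy as np

def negative_entropy_confidence(probs):
    # Score confidence as the negative entropy of the predicted
    # class distribution: more peaked predictions score higher.
    return np.sum(probs * np.log(probs + 1e-12), axis=1)

def learn_threshold(source_probs, source_correct):
    # Choose t so that the fraction of source points with
    # confidence above t equals the observed source accuracy.
    conf = negative_entropy_confidence(source_probs)
    acc = source_correct.mean()
    return np.quantile(conf, 1.0 - acc)

def predict_target_accuracy(target_probs, threshold):
    # Predicted accuracy = fraction of unlabeled target points
    # whose confidence exceeds the learned threshold.
    return (negative_entropy_confidence(target_probs) > threshold).mean()
```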
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias in the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z) - Contrastive Model Inversion for Data-Free Knowledge Distillation [60.08025054715192]
We propose Contrastive Model Inversion, where the data diversity is explicitly modeled as an optimizable objective.
Our main observation is that, under the constraint of the same amount of data, higher data diversity usually indicates stronger instance discrimination.
Experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet demonstrate that CMI achieves significantly superior performance when the generated data are used for knowledge distillation.
arXiv Detail & Related papers (2021-05-18T15:13:00Z) - Federated Survival Analysis with Discrete-Time Cox Models [0.46331617589391827]
We build machine learning models from decentralized datasets located in different centers with federated learning (FL).
We show that the resulting model may suffer from substantial performance loss in some adverse settings.
Using this approach, we train survival models with standard FL techniques on synthetic data, as well as real-world datasets from The Cancer Genome Atlas (TCGA).
arXiv Detail & Related papers (2020-06-16T08:53:19Z) - Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out-of-distribution data points at test time with a single forward pass.
We scale training with a novel loss function and centroid updating scheme, matching the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.