Preserving privacy in domain transfer of medical AI models comes at no
performance costs: The integral role of differential privacy
- URL: http://arxiv.org/abs/2306.06503v2
- Date: Thu, 7 Dec 2023 18:36:31 GMT
- Authors: Soroosh Tayebi Arasteh, Mahshad Lotfinia, Teresa Nolte, Marwin Saehn,
Peter Isfort, Christiane Kuhl, Sven Nebelung, Georgios Kaissis, Daniel Truhn
- Abstract summary: We evaluate the efficacy of DP-enhanced domain transfer (DP-DT) in diagnosing cardiomegaly, pleural effusion, pneumonia, atelectasis, and in identifying healthy subjects.
Our results show that DP-DT, even with exceptionally high privacy levels, performs comparably to non-DP-DT.
- Score: 5.025818976218807
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Developing robust and effective artificial intelligence (AI) models in
medicine requires access to large amounts of patient data. The use of AI models
solely trained on large multi-institutional datasets can help with this, yet
the imperative to ensure data privacy remains, particularly as membership
inference risks breaching patient confidentiality. As a proposed remedy, we
advocate for the integration of differential privacy (DP). We specifically
investigate the performance of models trained with DP as compared to models
trained without DP on data from institutions that the model had not seen during
its training (i.e., external validation) - the situation that is reflective of
the clinical use of AI models. By leveraging more than 590,000 chest
radiographs from five institutions, we evaluated the efficacy of DP-enhanced
domain transfer (DP-DT) in diagnosing cardiomegaly, pleural effusion,
pneumonia, atelectasis, and in identifying healthy subjects. We juxtaposed
DP-DT with non-DP-DT and examined diagnostic accuracy and demographic fairness
using the area under the receiver operating characteristic curve (AUC) as the
main metric, as well as accuracy, sensitivity, and specificity. Our results
show that DP-DT, even with exceptionally high privacy levels (epsilon around
1), performs comparably to non-DP-DT (P>0.119 across all domains). Furthermore,
DP-DT led to marginal AUC differences - less than 1% - for nearly all
subgroups, relative to non-DP-DT. Despite consistent evidence suggesting that
DP models induce significant performance degradation for on-domain
applications, we show that off-domain performance is almost unaffected.
Therefore, we ardently advocate for the adoption of DP in training diagnostic
medical AI models, given its minimal impact on performance.
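Since the abstract uses AUC as its main metric for comparing DP-DT with non-DP-DT, a minimal sketch of that comparison may help. It computes AUC as the probability that a randomly chosen positive case is scored above a randomly chosen negative case, with ties counted as one half. The function name `auc` and the toy labels and scores are hypothetical and are not taken from the paper; the authors' actual evaluation pipeline is not described here.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data for illustration only (not results from the paper).
labels = [1, 1, 1, 0, 0, 0]
scores_non_dp = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]  # assumed non-DP model outputs
scores_dp = [0.8, 0.7, 0.4, 0.5, 0.3, 0.2]      # assumed DP model outputs

auc_non_dp = auc(labels, scores_non_dp)
auc_dp = auc(labels, scores_dp)
print(f"non-DP AUC = {auc_non_dp:.3f}, DP AUC = {auc_dp:.3f}, "
      f"difference = {auc_non_dp - auc_dp:.3f}")
```

The pairwise formulation is quadratic in the number of cases, which is fine for a sketch; for datasets of the size reported in the abstract, a rank-based implementation would be the more practical choice.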
Related papers
- Weakly supervised deep learning model with size constraint for prostate cancer detection in multiparametric MRI and generalization to unseen domains [0.90668179713299]
We show that the model achieves on-par performance with strong fully supervised baseline models.
We also observe a performance decrease for both fully supervised and weakly supervised models when tested on unseen data domains.
arXiv Detail & Related papers (2024-11-04T12:24:33Z)
- Domain Adaptation of Echocardiography Segmentation Via Reinforcement Learning [4.850478245721347]
We introduce RL4Seg, an innovative reinforcement learning framework that reduces the need to otherwise incorporate large expertly annotated datasets in the target domain.
Using a target dataset of 10,000 unannotated 2D echocardiographic images, RL4Seg achieves 99% anatomical validity on a subset of 220 expert-validated subjects from the target domain.
arXiv Detail & Related papers (2024-06-25T19:26:39Z)
- Pre-training Differentially Private Models with Limited Public Data [54.943023722114134]
Differential privacy (DP) is a prominent method to gauge the degree of security provided to the models.
DP is not yet capable of protecting a substantial portion of the data used during the initial pre-training stage.
We develop a novel DP continual pre-training strategy using only 10% of public data.
Our strategy can achieve DP accuracy of 41.5% on ImageNet-21k, as well as non-DP accuracy of 55.7% and 60.0% on downstream tasks Places365 and iNaturalist-2021.
arXiv Detail & Related papers (2024-02-28T23:26:27Z)
- Privacy Constrained Fairness Estimation for Decision Trees [2.9906966931843093]
Measuring the fairness of any AI model requires the sensitive attributes of the individuals in the dataset.
We propose a novel method, dubbed Privacy-Aware Fairness Estimation of Rules (PAFER).
We show that using the Laplacian mechanism, the method is able to estimate SP with low error while guaranteeing the privacy of the individuals in the dataset with high certainty.
arXiv Detail & Related papers (2023-12-13T14:54:48Z)
- Reconciling AI Performance and Data Reconstruction Resilience for
Medical Imaging [52.578054703818125]
Artificial Intelligence (AI) models are vulnerable to information leakage of their training data, which can be highly sensitive.
Differential Privacy (DP) aims to circumvent these susceptibilities by setting a quantifiable privacy budget.
We show that using very large privacy budgets can render reconstruction attacks impossible, while drops in performance are negligible.
arXiv Detail & Related papers (2023-12-05T12:21:30Z)
- Dual-Reference Source-Free Active Domain Adaptation for Nasopharyngeal
Carcinoma Tumor Segmentation across Multiple Hospitals [9.845637899896365]
Nasopharyngeal carcinoma (NPC) is a prevalent and clinically significant malignancy that predominantly impacts the head and neck area.
We propose a novel Source-Free Active Domain Adaptation (SFADA) framework to facilitate domain adaptation for the Gross Tumor Volume (GTV) segmentation task.
We collect a large-scale clinical dataset comprising 1057 NPC patients from five hospitals to validate our approach.
arXiv Detail & Related papers (2023-09-23T15:26:27Z)
- Private, fair and accurate: Training large-scale, privacy-preserving AI models in medical imaging [47.99192239793597]
We evaluated the effect of privacy-preserving training of AI models regarding accuracy and fairness compared to non-private training.
Our study shows that -- under the challenging realistic circumstances of a real-life clinical dataset -- the privacy-preserving training of diagnostic deep learning models is possible with excellent diagnostic accuracy and fairness.
arXiv Detail & Related papers (2023-02-03T09:49:13Z)
- Advancing COVID-19 Diagnosis with Privacy-Preserving Collaboration in
Artificial Intelligence [79.038671794961]
We launch the Unified CT-COVID AI Diagnostic Initiative (UCADI), where the AI model can be distributedly trained and independently executed at each host institution.
Our study is based on 9,573 chest computed tomography scans (CTs) from 3,336 patients collected from 23 hospitals located in China and the UK.
arXiv Detail & Related papers (2021-11-18T00:43:41Z)
- Differentially private federated deep learning for multi-site medical
image segmentation [56.30543374146002]
Collaborative machine learning techniques such as federated learning (FL) enable the training of models on effectively larger datasets without data transfer.
Recent initiatives have demonstrated that segmentation models trained with FL can achieve performance similar to locally trained models.
However, FL is not a fully privacy-preserving technique and privacy-centred attacks can disclose confidential patient data.
arXiv Detail & Related papers (2021-07-06T12:57:32Z)
- Adversarial Sample Enhanced Domain Adaptation: A Case Study on
Predictive Modeling with Electronic Health Records [57.75125067744978]
We propose a data augmentation method to facilitate domain adaptation.
Adversarially generated samples are used during domain adaptation.
Results confirm the effectiveness of our method and the generality on different tasks.
arXiv Detail & Related papers (2021-01-13T03:20:20Z)