Can large language models be privacy preserving and fair medical coders?
- URL: http://arxiv.org/abs/2412.05533v1
- Date: Sat, 07 Dec 2024 04:27:05 GMT
- Title: Can large language models be privacy preserving and fair medical coders?
- Authors: Ali Dadsetan, Dorsa Soleymani, Xijie Zeng, Frank Rudzicz,
- Abstract summary: Differential privacy (DP) is a common method for preserving privacy in such settings.
We examine two key trade-offs in applying DP to the NLP task of medical coding (ICD classification)
- Score: 13.49769820767045
- License:
- Abstract: Protecting patient data privacy is a critical concern when deploying machine learning algorithms in healthcare. Differential privacy (DP) is a common method for preserving privacy in such settings and, in this work, we examine two key trade-offs in applying DP to the NLP task of medical coding (ICD classification). Regarding the privacy-utility trade-off, we observe a significant performance drop in the privacy preserving models, with more than a 40% reduction in micro F1 scores on the top 50 labels in the MIMIC-III dataset. From the perspective of the privacy-fairness trade-off, we also observe an increase of over 3% in the recall gap between male and female patients in the DP models. Further understanding these trade-offs will help towards the challenges of real-world deployment.
Related papers
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose PseudoProbability Unlearning (PPU), a novel method that enables models to forget data to adhere to privacy-preserving manner.
Our method achieves over 20% improvements in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z) - Differential privacy enables fair and accurate AI-based analysis of speech disorders while protecting patient data [10.6135892856374]
This study is the first to investigate differential privacy (DP)'s impact on pathological speech data.
We observed a maximum accuracy reduction of 3.85% when training with DP with high privacy levels.
To generalize our findings across languages and disorders, we validated our approach on a dataset of Spanish-speaking Parkinson's disease patients.
arXiv Detail & Related papers (2024-09-27T18:25:54Z) - Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP motivated by applications where it necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z) - TeD-SPAD: Temporal Distinctiveness for Self-supervised
Privacy-preservation for video Anomaly Detection [59.04634695294402]
Video anomaly detection (VAD) without human monitoring is a complex computer vision task.
Privacy leakage in VAD allows models to pick up and amplify unnecessary biases related to people's personal information.
We propose TeD-SPAD, a privacy-aware video anomaly detection framework that destroys visual private information in a self-supervised manner.
arXiv Detail & Related papers (2023-08-21T22:42:55Z) - Client-Level Differential Privacy via Adaptive Intermediary in Federated
Medical Imaging [33.494287036763716]
Trade-off of differential privacy (DP) between privacy protection and performance is still underexplored for real-world medical scenario.
We propose to optimize the trade-off under the context of client-level DP, which focuses on privacy during communications.
We propose an adaptive intermediary strategy to improve performance without harming privacy.
arXiv Detail & Related papers (2023-07-24T06:12:37Z) - Private, fair and accurate: Training large-scale, privacy-preserving AI models in medical imaging [47.99192239793597]
We evaluated the effect of privacy-preserving training of AI models regarding accuracy and fairness compared to non-private training.
Our study shows that -- under the challenging realistic circumstances of a real-life clinical dataset -- the privacy-preserving training of diagnostic deep learning models is possible with excellent diagnostic accuracy and fairness.
arXiv Detail & Related papers (2023-02-03T09:49:13Z) - Just Fine-tune Twice: Selective Differential Privacy for Large Language
Models [69.66654761324702]
We propose a simple yet effective just-fine-tune-twice privacy mechanism to achieve SDP for large Transformer-based language models.
Experiments show that our models achieve strong performance while staying robust to the canary insertion attack.
arXiv Detail & Related papers (2022-04-15T22:36:55Z) - Production of Categorical Data Verifying Differential Privacy:
Conception and Applications to Machine Learning [0.0]
Differential privacy is a formal definition that allows quantifying the privacy-utility trade-off.
With the local DP (LDP) model, users can sanitize their data locally before transmitting it to the server.
In all cases, we concluded that differentially private ML models achieve nearly the same utility metrics as non-private ones.
arXiv Detail & Related papers (2022-04-02T12:50:14Z) - Chasing Your Long Tails: Differentially Private Prediction in Health
Care Settings [34.26542589537452]
Methods for differentially private (DP) learning provide a general-purpose approach to learn models with privacy guarantees.
Modern methods for DP learning ensure privacy through mechanisms that censor information judged as too unique.
We use state-of-the-art methods for DP learning to train privacy-preserving models in clinical prediction tasks.
arXiv Detail & Related papers (2020-10-13T19:56:37Z) - Private Reinforcement Learning with PAC and Regret Guarantees [69.4202374491817]
We design privacy preserving exploration policies for episodic reinforcement learning (RL)
We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP)
We then develop a private optimism-based learning algorithm that simultaneously achieves strong PAC and regret bounds, and enjoys a JDP guarantee.
arXiv Detail & Related papers (2020-09-18T20:18:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.