Differential privacy for protecting patient data in speech disorder detection using deep learning
- URL: http://arxiv.org/abs/2409.19078v1
- Date: Fri, 27 Sep 2024 18:25:54 GMT
- Title: Differential privacy for protecting patient data in speech disorder detection using deep learning
- Authors: Soroosh Tayebi Arasteh, Mahshad Lotfinia, Paula Andrea Perez-Toro, Tomas Arias-Vergara, Juan Rafael Orozco-Arroyave, Maria Schuster, Andreas Maier, Seung Hee Yang
- Abstract summary: This study is the first to investigate differential privacy (DP)'s impact on pathological speech data.
We observed a maximum accuracy reduction of 3.85% when training with DP at a privacy budget (ε) of 7.51.
To generalize our findings, we validated our approach on a smaller dataset of Spanish-speaking Parkinson's disease patients.
- Score: 11.01272267983849
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Speech pathology affects communication abilities and quality of life. While deep learning-based models have shown potential in diagnosing these disorders, the use of sensitive data raises critical privacy concerns. Although differential privacy (DP) has been explored in the medical imaging domain, its application in pathological speech analysis remains largely unexplored despite the equally critical privacy concerns. This study is the first to investigate DP's impact on pathological speech data, focusing on the trade-offs between privacy, diagnostic accuracy, and fairness. Using a large, real-world dataset of 200 hours of recordings from 2,839 German-speaking participants, we observed a maximum accuracy reduction of 3.85% when training with DP with a privacy budget, denoted by ε, of 7.51. To generalize our findings, we validated our approach on a smaller dataset of Spanish-speaking Parkinson's disease patients, demonstrating that careful pretraining on large-scale task-specific datasets can maintain or even improve model accuracy under DP constraints. We also conducted a comprehensive fairness analysis, revealing that reasonable privacy levels (2 < ε < 10) do not introduce significant gender bias, though age-related disparities may require further attention. Our results suggest that DP can effectively balance privacy and utility in speech disorder detection, but they also highlight the unique challenges of the speech domain, particularly regarding the privacy-fairness trade-off. This provides a foundation for future work to refine DP methodologies and address fairness across diverse patient groups in real-world deployments.
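The abstract reports a privacy budget of ε = 7.51 but does not name a training stack here. As a point of reference, a minimal sketch of DP training in PyTorch with the Opacus library might look like the following; the classifier, the synthetic features, δ, and all hyperparameters are placeholder assumptions, not the authors' setup.

```python
# Hypothetical DP training sketch with PyTorch + Opacus; the model,
# data, and hyperparameters are placeholders, not the paper's setup.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Placeholder classifier over 128-dim speech embeddings (assumed).
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
criterion = nn.CrossEntropyLoss()

# Synthetic features standing in for pathological-speech data.
features, labels = torch.randn(1024, 128), torch.randint(0, 2, (1024,))
train_loader = DataLoader(TensorDataset(features, labels), batch_size=64)

privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    epochs=10,
    target_epsilon=7.51,  # budget reported in the abstract
    target_delta=1e-5,    # assumed; the abstract does not state delta
    max_grad_norm=1.0,    # per-sample gradient clipping bound (assumed)
)

for _ in range(10):
    for x, y in train_loader:
        optimizer.zero_grad()
        criterion(model(x), y).backward()
        optimizer.step()  # DP step: clipped per-sample grads + noise
```

Opacus then calibrates the noise multiplier so that its privacy accountant reaches the target ε after the stated number of epochs.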
Related papers
- Towards Privacy-aware Mental Health AI Models: Advances, Challenges, and Opportunities [61.633126163190724]
Mental illness is a widespread and debilitating condition with substantial societal and personal costs.
Recent advances in Artificial Intelligence (AI) hold great potential for recognizing and addressing conditions such as depression, anxiety disorder, bipolar disorder, schizophrenia, and post-traumatic stress disorder.
Privacy concerns, including the risk of sensitive data leakage from datasets and trained models, remain a critical barrier to deploying these AI systems in real-world clinical settings.
arXiv Detail & Related papers (2025-02-01T15:10:02Z) - Privacy-Preserving Heterogeneous Federated Learning for Sensitive Healthcare Data [12.30620268528346]
We propose a new framework termed Abstention-Aware Federated Voting (AAFV).
AAFV can collaboratively and confidentially train heterogeneous local models while simultaneously protecting data privacy.
In particular, the proposed abstention-aware voting mechanism exploits a threshold-based abstention method to select high-confidence votes from heterogeneous local models.
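The summary gives only the high-level rule; a toy sketch of threshold-based abstention voting, with a hypothetical threshold and confidence scores (not the paper's exact AAFV protocol), could read:

```python
# Toy threshold-based abstention voting; threshold and scores are
# hypothetical, not the paper's AAFV protocol.
import numpy as np

def abstention_aware_vote(probs: np.ndarray, threshold: float = 0.8) -> int:
    """probs: (n_models, n_classes) class probabilities. Models whose
    top confidence is below `threshold` abstain from voting."""
    votes = [int(p.argmax()) for p in probs if p.max() >= threshold]
    if not votes:  # everyone abstained: fall back to a soft average
        return int(probs.mean(axis=0).argmax())
    return int(np.bincount(votes).argmax())  # majority of confident votes

ensemble = np.array([[0.90, 0.10],   # confident, votes class 0
                     [0.85, 0.15],   # confident, votes class 0
                     [0.55, 0.45]])  # abstains (below threshold)
print(abstention_aware_vote(ensemble))  # -> 0 by majority of confident votes
```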
arXiv Detail & Related papers (2024-06-15T08:43:40Z) - The Impact of Speech Anonymization on Pathology and Its Limits [6.687980678659259]
This study investigates anonymization's impact on pathological speech across over 2,700 speakers from multiple German institutions.
Specific disorders such as Dysarthria, Dysphonia, and Cleft Lip and Palate experienced minimal utility changes, while Dysglossia showed slight improvements.
arXiv Detail & Related papers (2024-04-11T18:06:35Z) - PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind).
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z) - Private, fair and accurate: Training large-scale, privacy-preserving AI models in medical imaging [47.99192239793597]
We evaluated the effect of privacy-preserving training of AI models on accuracy and fairness, compared with non-private training.
Our study shows that -- under the challenging realistic circumstances of a real-life clinical dataset -- the privacy-preserving training of diagnostic deep learning models is possible with excellent diagnostic accuracy and fairness.
arXiv Detail & Related papers (2023-02-03T09:49:13Z) - How Do Input Attributes Impact the Privacy Loss in Differential Privacy? [55.492422758737575]
We study the connection between the per-subject norm in DP neural networks and individual privacy loss.
We introduce a novel metric termed the Privacy Loss-Input Susceptibility (PLIS) which allows one to apportion the subject's privacy loss to their input attributes.
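PLIS itself is defined in that paper; as a loose, hypothetical illustration of the underlying idea, one can score how strongly each input attribute drives a subject's gradient, e.g. via input gradients (this proxy is an assumption, not the paper's metric):

```python
# Loose input-gradient proxy for attribute susceptibility; this is an
# illustrative assumption, not the PLIS metric defined in the paper.
import torch
from torch import nn

def input_susceptibility(model, x, y, criterion=nn.CrossEntropyLoss()):
    """Magnitude of d(loss)/d(attribute) for one subject: a rough proxy
    for how much each input attribute can inflate the per-subject
    gradient norm that drives DP privacy loss."""
    x = x.clone().detach().requires_grad_(True)
    criterion(model(x), y).backward()
    return x.grad.abs().squeeze(0)

model = nn.Linear(5, 2)  # toy model over 5 hypothetical attributes
scores = input_susceptibility(model, torch.randn(1, 5), torch.tensor([1]))
print(scores)  # one susceptibility score per input attribute
```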
arXiv Detail & Related papers (2022-11-18T11:39:03Z) - The effect of speech pathology on automatic speaker verification -- a large-scale study [6.468412158245622]
Pathological speech faces heightened privacy-breach risks compared to healthy speech.
Adults with dysphonia are at heightened re-identification risks, whereas conditions like dysarthria yield results comparable to those of healthy speakers.
Merging data across pathological types led to a marked decrease in equal error rate (EER), suggesting the potential benefit of pathological diversity for automatic speaker verification.
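For context, the EER is the operating point where the false-acceptance and false-rejection rates coincide; a lower EER means the verifier separates speakers better, and hence a higher re-identification risk. A simple threshold-sweep computation (illustrative only, not the paper's evaluation code):

```python
# Illustrative EER computation via a simple threshold sweep; not the
# paper's evaluation code.
import numpy as np

def equal_error_rate(scores: np.ndarray, labels: np.ndarray) -> float:
    """labels: 1 for genuine (same-speaker) trials, 0 for impostor
    trials. Returns the rate where FAR and FRR are (nearly) equal."""
    genuine, impostor = scores[labels == 1], scores[labels == 0]
    eer, gap = 1.0, np.inf
    for t in np.unique(scores):
        far = float(np.mean(impostor >= t))  # impostors accepted
        frr = float(np.mean(genuine < t))    # genuine speakers rejected
        if abs(far - frr) < gap:
            gap, eer = abs(far - frr), (far + frr) / 2
    return eer

rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(2, 1, 500), rng.normal(0, 1, 500)])
labels = np.concatenate([np.ones(500), np.zeros(500)])
print(f"EER ~ {equal_error_rate(scores, labels):.3f}")  # about 0.16
```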
arXiv Detail & Related papers (2022-04-13T15:17:00Z) - Robustness Threats of Differential Privacy [70.818129585404]
We experimentally demonstrate that networks trained with differential privacy can, in some settings, be even more vulnerable than their non-private counterparts.
We study how the main ingredients of differentially private neural networks training, such as gradient clipping and noise addition, affect the robustness of the model.
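To make those two ingredients concrete, here is a deliberately simplified single DP-SGD step showing per-sample gradient clipping followed by Gaussian noise addition; production implementations vectorize the per-sample gradients rather than looping:

```python
# Deliberately simplified DP-SGD step showing the two ingredients:
# per-sample gradient clipping, then Gaussian noise addition.
import torch
from torch import nn

def dp_sgd_step(model, xs, ys, criterion, lr=0.1, clip=1.0, sigma=1.0):
    summed = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in zip(xs, ys):
        model.zero_grad()
        criterion(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        norm = torch.sqrt(sum(p.grad.norm() ** 2 for p in model.parameters()))
        scale = (clip / (norm + 1e-12)).clamp(max=1.0)  # clipping factor
        for acc, p in zip(summed, model.parameters()):
            acc += p.grad * scale  # accumulate clipped per-sample grads
    with torch.no_grad():
        for acc, p in zip(summed, model.parameters()):
            noise = sigma * clip * torch.randn_like(acc)  # Gaussian noise
            p -= lr * (acc + noise) / len(xs)

model = nn.Linear(4, 2)  # toy model and batch, for illustration only
xs, ys = torch.randn(8, 4), torch.randint(0, 2, (8,))
dp_sgd_step(model, xs, ys, nn.CrossEntropyLoss())
```

The noise standard deviation sigma * clip follows the Gaussian mechanism's usual calibration to the clipping bound.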
arXiv Detail & Related papers (2020-12-14T18:59:24Z) - Differentially Private and Fair Deep Learning: A Lagrangian Dual Approach [54.32266555843765]
This paper studies a model that protects the privacy of individuals' sensitive information while also learning non-discriminatory predictors.
The method relies on the notion of differential privacy and the use of Lagrangian duality to design neural networks that can accommodate fairness constraints.
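A minimal sketch of that primal-dual idea, assuming a two-group protected attribute and a loss-gap fairness constraint (both assumptions for illustration, not the paper's exact formulation):

```python
# Minimal primal-dual sketch: fairness gap as a constraint with a
# Lagrange multiplier; group labels and the gap are assumptions.
import torch
from torch import nn

def primal_dual_step(model, opt, x, y, group, lam, dual_lr=0.01, eps=0.05):
    """Minimize task loss + lam * max(0, gap - eps) over the weights,
    then ascend lam on the remaining violation. Assumes both groups
    appear in the batch."""
    losses = nn.CrossEntropyLoss(reduction="none")(model(x), y)
    gap = (losses[group == 0].mean() - losses[group == 1].mean()).abs()
    violation = gap - eps
    loss = losses.mean() + lam * violation.clamp(min=0.0)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return (lam + dual_lr * violation.detach()).clamp(min=0.0)  # dual ascent

model = nn.Linear(6, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(64, 6), torch.randint(0, 2, (64,))
group = torch.randint(0, 2, (64,))  # protected attribute, e.g. binary sex
lam = torch.tensor(0.0)
for _ in range(5):
    lam = primal_dual_step(model, opt, x, y, group, lam)
```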
arXiv Detail & Related papers (2020-09-26T10:50:33Z) - Anonymizing Data for Privacy-Preserving Federated Learning [3.3673553810697827]
We propose the first syntactic approach for offering privacy in the context of federated learning.
Our approach aims to maximize utility or model performance, while supporting a defensible level of privacy.
We perform a comprehensive empirical evaluation on two important problems in the healthcare domain, using real-world electronic health data of 1 million patients.
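Syntactic (as opposed to DP-style) approaches typically generalize quasi-identifiers until every record is indistinguishable from at least k-1 others; a toy k-anonymity check under that assumption, with illustrative field names:

```python
# Toy syntactic-anonymity check: generalize a quasi-identifier (age)
# and verify k-anonymity; field names and k are illustrative.
from collections import Counter

def generalize_age(age: int, width: int = 10) -> str:
    """Coarsen an exact age into a bucket, e.g. 42 -> '40-49'."""
    lo = (age // width) * width
    return f"{lo}-{lo + width - 1}"

def is_k_anonymous(records, k: int) -> bool:
    """Every quasi-identifier combination must occur at least k times."""
    return all(c >= k for c in Counter(records).values())

patients = [(generalize_age(a), s) for a, s in
            [(42, "F"), (45, "F"), (47, "F"), (61, "M"), (66, "M")]]
print(is_k_anonymous(patients, k=2))  # True: each bucket holds >= 2 records
```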
arXiv Detail & Related papers (2020-02-21T02:30:16Z)