Benchmarking Differential Privacy and Federated Learning for BERT Models
- URL: http://arxiv.org/abs/2106.13973v1
- Date: Sat, 26 Jun 2021 08:52:02 GMT
- Title: Benchmarking Differential Privacy and Federated Learning for BERT Models
- Authors: Priyam Basu, Tiasa Singha Roy, Rakshit Naidu, Zumrut Muftuoglu, Sahib Singh, Fatemehsadat Mireshghallah
- Abstract summary: Depression is a serious medical illness that can have adverse effects on how one feels, thinks, and acts.
Due to the sensitive nature of such data, privacy measures need to be taken for handling and training models with such data.
We study the effects that the application of Differential Privacy (DP) has, in both a centralized and a Federated Learning (FL) setup, on training contextualized language models.
- Score: 0.6524460254566904
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Natural Language Processing (NLP) techniques can be applied to help with the
diagnosis of medical conditions such as depression, using a collection of a
person's utterances. Depression is a serious medical illness that can have
adverse effects on how one feels, thinks, and acts, which can lead to emotional
and physical problems. Due to the sensitive nature of such data, privacy
measures need to be taken for handling and training models with such data. In
this work, we study the effects that the application of Differential Privacy
(DP) has, in both a centralized and a Federated Learning (FL) setup, on
training contextualized language models (BERT, ALBERT, RoBERTa and DistilBERT).
We offer insights on how to privately train NLP models and what architectures
and setups provide more desirable privacy-utility trade-offs. We envisage this
work to be used in future healthcare and mental health studies to keep medical
history private. Therefore, we provide an open-source implementation of this
work.
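The two setups named in the abstract can be illustrated with a small sketch. This is not the authors' open-source implementation. It assumes the standard DP-SGD recipe (clip each per-example gradient, then add Gaussian noise) for the centralized case and a FedAvg-style weighted average for the federated case. All function and parameter names (`clip_by_l2_norm`, `dp_average_gradient`, `federated_average`, `max_norm`, `noise_multiplier`) are hypothetical and chosen for illustration only.

```python
import math
import random


def clip_by_l2_norm(grad, max_norm):
    """Scale grad down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_average_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each per-example gradient, sum, add Gaussian noise, then average.

    Assumption: noise standard deviation is noise_multiplier * max_norm,
    as in the usual DP-SGD formulation.
    """
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        clipped = clip_by_l2_norm(grad, max_norm)
        total = [t + c for t, c in zip(total, clipped)]
    sigma = noise_multiplier * max_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [n / len(per_example_grads) for n in noisy]


def federated_average(client_weights, client_sizes):
    """Average client weight vectors, weighted by local dataset size."""
    total_size = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total_size
        for i in range(dim)
    ]
```

In a DP-plus-FL setup, each client would run something like `dp_average_gradient` locally and a server would combine the resulting weights with something like `federated_average`. Privacy accounting (tracking the cumulative epsilon) is omitted from this sketch.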
Related papers
- Towards Privacy-aware Mental Health AI Models: Advances, Challenges, and Opportunities [61.633126163190724]
Mental illness is a widespread and debilitating condition with substantial societal and personal costs.
Recent advances in Artificial Intelligence (AI) hold great potential for recognizing and addressing conditions such as depression, anxiety disorder, bipolar disorder, schizophrenia, and post-traumatic stress disorder.
Privacy concerns, including the risk of sensitive data leakage from datasets and trained models, remain a critical barrier to deploying these AI systems in real-world clinical settings.
arXiv Detail & Related papers (2025-02-01T15:10:02Z)
- MentalArena: Self-play Training of Language Models for Diagnosis and Treatment of Mental Health Disorders [59.515827458631975]
Mental health disorders are among the most serious diseases in the world.
Privacy concerns limit the accessibility of personalized treatment data.
MentalArena is a self-play framework to train language models.
arXiv Detail & Related papers (2024-10-09T13:06:40Z)
- CASE: Efficient Curricular Data Pre-training for Building Assistive Psychology Expert Models [1.0840985826142429]
This study explores the use of Natural Language Processing (NLP) pipelines to analyze text data from online mental health forums used for consultations.
By analyzing forum posts, these pipelines can flag users who may require immediate professional attention.
Case-BERT demonstrates superior performance compared to existing methods, achieving an F1 score of 0.91 for Depression and 0.88 for Anxiety.
arXiv Detail & Related papers (2024-06-01T06:17:32Z)
- Vision Through the Veil: Differential Privacy in Federated Learning for Medical Image Classification [15.382184404673389]
The proliferation of deep learning applications in healthcare calls for data aggregation across various institutions.
Privacy-preserving mechanisms are paramount in medical image analysis, where the data are sensitive in nature.
This study addresses the need by integrating differential privacy, a leading privacy-preserving technique, into a federated learning framework for medical image classification.
arXiv Detail & Related papers (2023-06-30T16:48:58Z)
- Privacy-preserving machine learning for healthcare: open challenges and future perspectives [72.43506759789861]
We conduct a review of recent literature concerning Privacy-Preserving Machine Learning (PPML) for healthcare.
We primarily focus on privacy-preserving training and inference-as-a-service.
The aim of this review is to guide the development of private and efficient ML models in healthcare.
arXiv Detail & Related papers (2023-03-27T19:20:51Z)
- GDPR Compliant Collection of Therapist-Patient-Dialogues [48.091760741427656]
We elaborate on the challenges we faced in starting our collection of therapist-patient dialogues in a psychiatry clinic under the General Data Protection Regulation (GDPR) of the European Union.
We give an overview of each step in our procedure and point out the potential pitfalls to motivate further research in this field.
arXiv Detail & Related papers (2022-11-22T15:51:10Z)
- Practical Challenges in Differentially-Private Federated Survival Analysis of Medical Data [57.19441629270029]
In this paper, we take advantage of the inherent properties of neural networks to federate the training of survival analysis models.
In the realistic setting of small medical datasets and only a few data centers, the noise added for differential privacy makes it harder for the models to converge.
We propose DPFed-post which adds a post-processing stage to the private federated learning scheme.
arXiv Detail & Related papers (2022-02-08T10:03:24Z)
- Learning Language and Multimodal Privacy-Preserving Markers of Mood from Mobile Data [74.60507696087966]
Mental health conditions remain underdiagnosed even in countries with common access to advanced medical care.
One promising data source to help monitor human behavior is daily smartphone usage.
We study behavioral markers of daily mood using a recent dataset of mobile behaviors from adolescent populations at high risk of suicidal behaviors.
arXiv Detail & Related papers (2021-06-24T17:46:03Z)
- FedMood: Federated Learning on Mobile Health Data for Mood Detection [26.263092039195786]
Depression is one of the most common mental illnesses.
Traditional centralized machine learning needs to aggregate patient data.
Data from patients with mental illness must be kept strictly confidential.
arXiv Detail & Related papers (2021-02-06T15:19:08Z)
- MET: Multimodal Perception of Engagement for Telehealth [52.54282887530756]
We present MET, a learning-based algorithm for perceiving a human's level of engagement from videos.
We release a new dataset, MEDICA, for mental health patient engagement detection.
arXiv Detail & Related papers (2020-11-17T15:18:38Z)