How Much User Context Do We Need? Privacy by Design in Mental Health NLP Application
- URL: http://arxiv.org/abs/2209.02022v1
- Date: Mon, 5 Sep 2022 15:41:45 GMT
- Title: How Much User Context Do We Need? Privacy by Design in Mental Health NLP Application
- Authors: Ramit Sawhney and Atula Tejaswi Neerkaje and Ivan Habernal and Lucie Flek
- Abstract summary: Clinical tasks such as mental health assessment from text must take social constraints into account.
We present the first analysis juxtaposing user history length and differential privacy budgets, and elaborate on how modeling additional user context enables utility preservation.
- Score: 33.3172788815152
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Clinical NLP tasks such as mental health assessment from text must take social constraints into account: performance maximization must be constrained by the utmost importance of guaranteeing the privacy of user data. Consumer protection regulations, such as the GDPR, generally handle privacy by restricting data availability, for example by requiring that user data be limited to 'what is necessary' for a given purpose. In this work, we reason that providing stricter formal privacy guarantees, while increasing the volume of user data in the model, in most cases increases benefit for all parties involved, especially for the user. We demonstrate our arguments on two existing suicide risk assessment datasets of Twitter and Reddit posts. We present the first analysis juxtaposing user history length and differential privacy budgets, and elaborate on how modeling additional user context enables utility preservation while maintaining acceptable user privacy guarantees.
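To make the trade-off concrete, below is a minimal sketch of how the two quantities the abstract juxtaposes, user history length and the differential privacy budget, typically enter a DP-SGD style training loop. This is an illustrative sketch, not the authors' implementation: `HISTORY_LEN`, `build_history_text`, `dp_sgd_step`, and the toy linear classifier are assumptions of this example, and in practice the noise multiplier would be calibrated from the target (epsilon, delta) and the number of training steps with a privacy accountant rather than fixed by hand.

```python
# Illustrative DP-SGD sketch (not the paper's code). Two knobs mirror the
# paper's analysis: HISTORY_LEN (how much user context is modeled) and the
# privacy budget, reflected here in a hand-picked NOISE_MULTIPLIER.
import torch
import torch.nn.functional as F

HISTORY_LEN = 8          # number of most recent user posts used as context (assumed value)
CLIP_NORM = 1.0          # per-example gradient clipping bound C
NOISE_MULTIPLIER = 1.1   # Gaussian noise scale sigma, relative to C (assumed value)

def build_history_text(user_posts):
    """Concatenate the user's most recent HISTORY_LEN posts into one input string."""
    return " ".join(user_posts[-HISTORY_LEN:])

def dp_sgd_step(model, optimizer, batch_x, batch_y):
    """One DP-SGD step: clip each example's gradient, add Gaussian noise, then step."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    for x, y in zip(batch_x, batch_y):
        loss = F.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)            # per-example gradients
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = min(1.0, CLIP_NORM / (norm.item() + 1e-12))  # clip to norm <= C
        for s, g in zip(summed, grads):
            s.add_(g * scale)
    optimizer.zero_grad()
    for p, s in zip(params, summed):
        noise = torch.randn_like(s) * NOISE_MULTIPLIER * CLIP_NORM
        p.grad = (s + noise) / len(batch_x)                   # noisy mean gradient
    optimizer.step()

# Toy usage with a hypothetical linear risk classifier over fixed-size features.
model = torch.nn.Linear(128, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
batch_x, batch_y = torch.randn(16, 128), torch.randint(0, 2, (16,))
dp_sgd_step(model, optimizer, batch_x, batch_y)
```

Under this reading, a longer history gives the model more signal per user at a fixed noise level, which is the utility-preservation effect the abstract describes.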
Related papers
- PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories.
We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds.
State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z)
- Collection, usage and privacy of mobility data in the enterprise and public administrations [55.2480439325792]
Security measures such as anonymization are needed to protect individuals' privacy.
Within our study, we conducted expert interviews to gain insights into practices in the field.
We survey privacy-enhancing methods in use, which generally do not comply with state-of-the-art standards of differential privacy.
arXiv Detail & Related papers (2024-07-04T08:29:27Z)
- Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
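Since the entry above leans on the formal DP guarantee, here is a standard statement of (epsilon, delta)-differential privacy with the privacy unit taken to be a user's entire contribution; the notation is mine, not quoted from the linked paper. A mechanism $M$ (e.g. a fine-tuning procedure) is user-level $(\varepsilon, \delta)$-DP if for all datasets $D, D'$ differing in one user's full data and all sets of outputs $S$:

\Pr[M(D) \in S] \le e^{\varepsilon} \, \Pr[M(D') \in S] + \delta

Smaller $\varepsilon$ and $\delta$ mean the trained models are closer to 'indistinguishable' with or without that user's data.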
- Mean Estimation Under Heterogeneous Privacy Demands [5.755004576310333]
This work considers the problem of mean estimation, where each user can impose their own privacy level.
The algorithm we propose is shown to be minimax optimal and has a near-linear run-time.
Users with weaker but differing privacy requirements are all given more privacy than they require, in equal amounts.
arXiv Detail & Related papers (2023-10-19T20:29:19Z)
- Privacy Preserving Large Language Models: ChatGPT Case Study Based Vision and Framework [6.828884629694705]
This article proposes a conceptual model called PrivChatGPT, a privacy-preserving model for LLMs.
PrivChatGPT consists of two main components: preserving user privacy during data curation/pre-processing together with preserving private context, and a private training process for large-scale data.
arXiv Detail & Related papers (2023-10-19T06:55:13Z)
- Mean Estimation Under Heterogeneous Privacy: Some Privacy Can Be Free [13.198689566654103]
This work considers the problem of mean estimation under heterogeneous Differential Privacy constraints.
The algorithm we propose is shown to be minimax optimal when there are two groups of users with distinct privacy levels.
arXiv Detail & Related papers (2023-04-27T05:23:06Z)
- Privacy Explanations - A Means to End-User Trust [64.7066037969487]
We looked into how explainability might help to tackle this problem.
We created privacy explanations that aim to clarify to end users why, and for what purposes, specific data is required.
Our findings reveal that privacy explanations can be an important step towards increasing trust in software systems.
arXiv Detail & Related papers (2022-10-18T09:30:37Z)
- Leveraging Privacy Profiles to Empower Users in the Digital Society [7.350403786094707]
Privacy and ethics of citizens are at the core of the concerns raised by our increasingly digital society.
We focus on the privacy dimension and contribute a step in the above direction through an empirical study on an existing dataset collected from the fitness domain.
The results reveal that a compact set of semantic-driven questions helps distinguish users better than a complex domain-dependent one.
arXiv Detail & Related papers (2022-04-01T15:31:50Z)
- PCAL: A Privacy-preserving Intelligent Credit Risk Modeling Framework Based on Adversarial Learning [111.19576084222345]
This paper proposes a framework of Privacy-preserving Credit risk modeling based on Adversarial Learning (PCAL).
PCAL aims to mask the private information inside the original dataset, while maintaining the important utility information for the target prediction task performance.
Results indicate that PCAL can learn an effective, privacy-free representation from user data, providing a solid foundation towards privacy-preserving machine learning for credit risk analysis.
arXiv Detail & Related papers (2020-10-06T07:04:59Z)