Beyond The Text: Analysis of Privacy Statements through Syntactic and Semantic Role Labeling
- URL: http://arxiv.org/abs/2010.00678v1
- Date: Thu, 1 Oct 2020 20:48:37 GMT
- Title: Beyond The Text: Analysis of Privacy Statements through Syntactic and Semantic Role Labeling
- Authors: Yan Shvartzshnaider, Ananth Balashankar, Vikas Patidar, Thomas Wies, Lakshminarayanan Subramanian
- Abstract summary: This paper formulates a new task of extracting privacy parameters from a privacy policy, through the lens of Contextual Integrity.
We show that traditional NLP tasks, including the recently proposed Question-Answering-based solutions, are insufficient to address the privacy parameter extraction problem.
- Score: 12.74252812104216
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper formulates a new task of extracting privacy parameters from a
privacy policy, through the lens of Contextual Integrity, an established social
theory framework for reasoning about privacy norms. Privacy policies, written
by lawyers, are lengthy and often comprise incomplete and vague statements. In
this paper, we show that traditional NLP tasks, including the recently proposed
Question-Answering-based solutions, are insufficient to address the privacy
parameter extraction problem, yielding poor precision and recall. We describe
four types of conventional methods that can be partially adapted to
address the parameter extraction task with varying degrees of success: Hidden
Markov Models, BERT fine-tuned models, Dependency Type Parsing (DP) and
Semantic Role Labeling (SRL). Based on a detailed evaluation across 36
real-world privacy policies of major enterprises, we demonstrate that a
solution combining syntactic DP coupled with type-specific SRL tasks provides
the highest accuracy for retrieving contextual privacy parameters from privacy
statements. We also observe that incorporating domain-specific knowledge is
critical to achieving high precision and recall, thus inspiring new NLP
research to address this important problem in the privacy domain.
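To make the extraction task concrete, here is a minimal sketch of the dependency-parsing half of such a pipeline. It assumes spaCy and its en_core_web_sm model (the paper does not prescribe a library), and the transmission-verb lexicon and role mapping are illustrative stand-ins for the domain-specific knowledge and type-specific SRL pass described above.

```python
import spacy  # assumes spaCy and en_core_web_sm are installed

nlp = spacy.load("en_core_web_sm")

# Hypothetical lexicon of transmission verbs; the paper's domain knowledge is richer.
TRANSMISSION_VERBS = {"share", "collect", "disclose", "sell", "transfer"}

def extract_ci_parameters(statement: str) -> dict:
    """Map dependency relations around a transmission verb onto Contextual
    Integrity parameters: nsubj -> sender, dobj -> attribute (information
    type), pobj of 'with'/'to' -> recipient. A rough approximation only."""
    doc = nlp(statement)
    params = {"sender": None, "attribute": None, "recipient": None}
    for token in doc:
        if token.pos_ == "VERB" and token.lemma_ in TRANSMISSION_VERBS:
            for child in token.children:
                if child.dep_ == "nsubj":
                    params["sender"] = " ".join(t.text for t in child.subtree)
                elif child.dep_ == "dobj":
                    params["attribute"] = " ".join(t.text for t in child.subtree)
                elif child.dep_ == "prep" and child.lemma_ in {"with", "to"}:
                    for obj in child.children:
                        if obj.dep_ == "pobj":
                            params["recipient"] = " ".join(t.text for t in obj.subtree)
    return params

print(extract_ci_parameters("We may share your location data with advertising partners."))
# {'sender': 'We', 'attribute': 'your location data', 'recipient': 'advertising partners'}
```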
Related papers
- PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories.
We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds.
State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z)
- Private Optimal Inventory Policy Learning for Feature-based Newsvendor with Unknown Demand [13.594765018457904]
This paper introduces a novel approach to estimate a privacy-preserving optimal inventory policy within the f-differential privacy framework.
We develop a clipped noisy gradient descent algorithm based on convolution smoothing for optimal inventory estimation.
Our numerical experiments demonstrate that the proposed new method can achieve desirable privacy protection with a marginal increase in cost.
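As a rough illustration of the clipped noisy gradient idea, here is a minimal sketch for a linear newsvendor policy q = X @ theta. All names are assumptions; the paper additionally smooths the piecewise-linear newsvendor loss by convolution and calibrates the noise under f-differential privacy, which this sketch omits in favor of raw subgradients and a generic Gaussian noise multiplier.

```python
import numpy as np

rng = np.random.default_rng(0)

def clipped_noisy_gd(X, d, c_under, c_over, epochs=50, lr=0.05,
                     clip=1.0, noise_mult=1.0):
    """Clipped noisy (sub)gradient descent for a linear policy q = X @ theta.

    Per-sample subgradient of the newsvendor cost
    c_under * max(d - q, 0) + c_over * max(q - d, 0).
    """
    n, p = X.shape
    theta = np.zeros(p)
    for _ in range(epochs):
        q = X @ theta
        g = np.where((d > q)[:, None], -c_under * X, c_over * X)
        # Clip each per-sample gradient to L2 norm `clip`, then add noise
        norms = np.linalg.norm(g, axis=1, keepdims=True)
        g = g * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
        noise = rng.normal(scale=noise_mult * clip, size=p)
        theta -= lr * (g.sum(axis=0) + noise) / n
    return theta
```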
arXiv Detail & Related papers (2024-04-23T19:15:43Z)
- Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory [82.7042006247124]
We show that even the most capable AI models reveal private information in contexts that humans would not, doing so in 39% and 57% of cases for the two strongest models evaluated.
Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.
arXiv Detail & Related papers (2023-10-27T04:15:30Z)
- PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind).
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
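For intuition, instruction tuning with positive and negative examples might pair demonstrations like the following; the field names and texts are invented for illustration, not the paper's actual schema.

```python
# Invented contrastive instruction-tuning pairs: a positive example showing
# the desired use of context, and a negative one showing a refusal to
# surface protected information.
examples = [
    {
        "instruction": "Summarize the support ticket for the engineering team.",
        "context": "Ticket #812: app crashes on login since the last update.",
        "output": "User reports login crashes after the latest update.",
    },
    {
        "instruction": "What is this customer's home address?",
        "context": "Ticket #812 ... Address: [PROTECTED]",
        "output": "I can't share personal contact details from this ticket.",
    },
]
```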
arXiv Detail & Related papers (2023-10-03T22:37:01Z)
- A Randomized Approach for Tight Privacy Accounting [63.67296945525791]
We propose a new differential privacy paradigm called estimate-verify-release (EVR).
The EVR paradigm first estimates the privacy parameter of a mechanism, then verifies whether the mechanism meets this guarantee, and finally releases the query output.
Our empirical evaluation shows the newly proposed EVR paradigm improves the utility-privacy tradeoff for privacy-preserving machine learning.
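A schematic of the flow, with the mechanism and verifier left abstract (everything named here is an assumption; the paper's verifier is a randomized privacy test, which this sketch does not implement):

```python
def evr_release(mechanism, data, verifier, eps_estimate):
    """Estimate-verify-release flow (schematic; names are assumptions).

    1. estimate: take a candidate privacy parameter for the mechanism,
    2. verify:   check the mechanism actually satisfies that guarantee,
    3. release:  return the query output only if verification passes.
    """
    if not verifier(mechanism, eps_estimate):
        raise RuntimeError("privacy estimate not verified; nothing released")
    return mechanism(data)
```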
arXiv Detail & Related papers (2023-04-17T00:38:01Z)
- DP-BART for Privatized Text Rewriting under Local Differential Privacy [2.45626162429986]
We propose a new system 'DP-BART' that largely outperforms existing LDP systems.
Our approach uses a novel clipping method, iterative pruning, and further training of internal representations, which drastically reduces the amount of noise required for DP guarantees.
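The core clip-and-noise step might look like the sketch below, applied to the encoder's latent vector before decoding. This is a simplified stand-in: the pruning and extra training of representations that the system uses to shrink the latent dimension are omitted, as is decoding back to text.

```python
import numpy as np

rng = np.random.default_rng(1)

def privatize_latent(z, clip=1.0, epsilon=5.0):
    """Clip each latent coordinate to [-clip, clip], then add Laplace noise.

    After clipping, the L1 distance between latents of any two inputs is at
    most 2 * clip * z.size, so Laplace noise with scale sensitivity/epsilon
    gives epsilon-LDP for this step.
    """
    z = np.clip(np.asarray(z, dtype=float), -clip, clip)
    sensitivity = 2.0 * clip * z.size
    return z + rng.laplace(scale=sensitivity / epsilon, size=z.size)
```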
arXiv Detail & Related papers (2023-02-15T13:07:34Z)
- Algorithms with More Granular Differential Privacy Guarantees [65.3684804101664]
We consider partial differential privacy (DP), which allows quantifying the privacy guarantee on a per-attribute basis.
In this work, we study several basic data analysis and learning tasks, and design algorithms whose per-attribute privacy parameter is smaller than the best possible privacy parameter for the entire record of a person.
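As a toy illustration of per-attribute budgets (not the paper's formal construction), noise can be calibrated separately for each field of a record:

```python
import numpy as np

rng = np.random.default_rng(2)

def per_attribute_laplace(record, sensitivities, epsilons):
    """Attribute-specific Laplace noise; names and calibration are assumptions."""
    return [x + rng.laplace(scale=s / e)
            for x, s, e in zip(record, sensitivities, epsilons)]

# Spend a smaller budget (more noise) on the more sensitive salary field.
noisy = per_attribute_laplace([34.0, 52000.0],
                              sensitivities=[1.0, 1000.0],
                              epsilons=[1.0, 0.1])
```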
arXiv Detail & Related papers (2022-09-08T22:43:50Z)
- ADePT: Auto-encoder based Differentially Private Text Transformation [22.068984615657463]
We provide a utility-preserving differentially private text transformation algorithm using auto-encoders.
Our algorithm transforms text to offer robustness against attacks and produces transformations with high semantic quality.
Our results show that the proposed model performs better against membership inference attacks (MIA) while offering lower to no degradation in the utility of the underlying transformation process.
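For reference, a standard loss-thresholding membership inference baseline (a generic formulation, not necessarily ADePT's exact attack setup) looks like:

```python
def loss_threshold_mia(model_loss, candidates, threshold):
    """Flag a record as a suspected training member when the model's loss on
    it falls below a threshold (classic loss-based membership inference)."""
    return [model_loss(x) < threshold for x in candidates]
```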
arXiv Detail & Related papers (2021-01-29T23:15:24Z)
- Differentially Private Representation for NLP: Formal Guarantee and An Empirical Study on Privacy and Fairness [38.90014773292902]
It has been demonstrated that the hidden representation learned by a deep model can encode private information about the input.
We propose Differentially Private Neural Representation (DPNR) to preserve the privacy of the extracted representation from text.
arXiv Detail & Related papers (2020-10-03T05:58:32Z)
- Private Reinforcement Learning with PAC and Regret Guarantees [69.4202374491817]
We design privacy-preserving exploration policies for episodic reinforcement learning (RL).
We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP).
We then develop a private optimism-based learning algorithm that simultaneously achieves strong PAC and regret bounds, and enjoys a JDP guarantee.
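One standard building block behind such algorithms is privatizing visit counts before computing optimism bonuses; the sketch below is a generic illustration with assumed names, not the paper's calibrated construction.

```python
import numpy as np

rng = np.random.default_rng(3)

def optimistic_bonus(visit_count, epsilon, beta=1.0):
    """Exploration bonus computed from a Laplace-privatized visit count."""
    noisy_n = max(visit_count + rng.laplace(scale=1.0 / epsilon), 1.0)
    return beta / np.sqrt(noisy_n)
```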
arXiv Detail & Related papers (2020-09-18T20:18:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.