Automated Detection and Analysis of Data Practices Using A Real-World
Corpus
- URL: http://arxiv.org/abs/2402.11006v1
- Date: Fri, 16 Feb 2024 18:51:40 GMT
- Title: Automated Detection and Analysis of Data Practices Using A Real-World
Corpus
- Authors: Mukund Srinath, Pranav Venkit, Maria Badillo, Florian Schaub, C. Lee
Giles, Shomir Wilson
- Abstract summary: We propose an automated approach to identify and visualize data practices within privacy policies at different levels of detail.
Our approach accurately matches data practice descriptions with policy excerpts, facilitating the presentation of simplified privacy information to users.
- Score: 20.4572759138767
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Privacy policies are crucial for informing users about data practices, yet
their length and complexity often deter users from reading them. In this paper,
we propose an automated approach to identify and visualize data practices
within privacy policies at different levels of detail. Leveraging crowd-sourced
annotations from the ToS;DR platform, we experiment with various methods to
match policy excerpts with predefined data practice descriptions. We further
conduct a case study to evaluate our approach on a real-world policy,
demonstrating its effectiveness in simplifying complex policies. Experiments
show that our approach accurately matches data practice descriptions with
policy excerpts, facilitating the presentation of simplified privacy
information to users.
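The core matching step described in the abstract — scoring a policy excerpt against predefined data practice descriptions and picking the best match — can be sketched with a simple bag-of-words cosine similarity. This is a minimal illustration, not the paper's actual method; the practice descriptions below are hypothetical placeholders, not real ToS;DR cases.

```python
import math
from collections import Counter

def bow(text):
    """Lowercased bag-of-words term-frequency vector for a snippet."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical data practice descriptions (placeholders, not from ToS;DR).
practices = {
    "data_sharing": "this service shares your personal data with third parties",
    "data_retention": "this service retains your data after account deletion",
}

def match_practice(excerpt, practices):
    """Return the practice label whose description best matches the excerpt."""
    scores = {name: cosine(bow(excerpt), bow(desc))
              for name, desc in practices.items()}
    return max(scores, key=scores.get)

excerpt = "We may share personal data about you with third parties."
print(match_practice(excerpt, practices))  # -> data_sharing
```

In practice the paper experiments with stronger matching methods than raw lexical overlap, but the pipeline shape — embed or vectorize both sides, score all pairs, take the top match — is the same.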
Related papers
- Entailment-Driven Privacy Policy Classification with LLMs [3.564208334473993]
We propose a framework to classify paragraphs of privacy policies into meaningful labels that are easily understood by users.
Our framework improves the F1 score on average by 11.2%.
arXiv Detail & Related papers (2024-09-25T05:07:05Z)
- Privacy Policy Analysis through Prompt Engineering for LLMs [3.059256166047627]
PAPEL (Privacy Policy Analysis through Prompt Engineering for LLMs) is a framework harnessing the power of Large Language Models (LLMs) to automate the analysis of privacy policies.
It aims to streamline the extraction, annotation, and summarization of information from these policies, enhancing their accessibility and comprehensibility without requiring additional model training.
We demonstrate the effectiveness of PAPEL with two applications: (i) annotation and (ii) contradiction analysis.
arXiv Detail & Related papers (2024-09-23T10:23:31Z)
- One-Shot Learning as Instruction Data Prospector for Large Language Models [108.81681547472138]
Nuggets uses one-shot learning to select high-quality instruction data from extensive datasets.
We show that instruction tuning with the top 1% of examples curated by Nuggets substantially outperforms conventional methods employing the entire dataset.
arXiv Detail & Related papers (2023-12-16T03:33:12Z)
- Counterfactual Learning with General Data-generating Policies [3.441021278275805]
We develop an OPE method for a class of full support and deficient support logging policies in contextual-bandit settings.
We prove that our method's prediction converges in probability to the true performance of a counterfactual policy as the sample size increases.
arXiv Detail & Related papers (2022-12-04T21:07:46Z)
- Data augmentation for efficient learning from parametric experts [88.33380893179697]
We focus on what we call the policy cloning setting, in which we use online or offline queries of an expert to inform the behavior of a student policy.
Our approach, augmented policy cloning (APC), uses synthetic states to induce feedback-sensitivity in a region around sampled trajectories.
We achieve highly data-efficient transfer of behavior from an expert to a student policy for high-degrees-of-freedom control problems.
arXiv Detail & Related papers (2022-05-23T16:37:16Z)
- A Regularized Implicit Policy for Offline Reinforcement Learning [54.7427227775581]
Offline reinforcement learning enables learning from a fixed dataset, without further interaction with the environment.
We propose a framework that supports learning a flexible yet well-regularized fully-implicit policy.
Experiments and ablation study on the D4RL dataset validate our framework and the effectiveness of our algorithmic designs.
arXiv Detail & Related papers (2022-02-19T20:22:04Z)
- Privacy-Constrained Policies via Mutual Information Regularized Policy Gradients [54.98496284653234]
We consider the task of training a policy that maximizes reward while minimizing disclosure of certain sensitive state variables through the actions.
We solve this problem by introducing a regularizer based on the mutual information between the sensitive state and the actions.
We develop a model-based estimator for optimization of privacy-constrained policies.
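The regularized objective this entry describes can be sketched as follows; the notation here is ours (not taken from the paper), with λ trading off expected reward against disclosure of the sensitive state through the actions:

```latex
J(\pi) \;=\; \mathbb{E}_{\pi}\!\left[\sum_{t} r(s_t, a_t)\right] \;-\; \lambda \, I(S_{\text{sensitive}}; A)
```

Maximizing J(π) drives the policy toward high reward while the mutual-information term penalizes action distributions that reveal the sensitive state variables.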
arXiv Detail & Related papers (2020-12-30T03:22:35Z)
- Policy Evaluation Networks [50.53250641051648]
We introduce a scalable, differentiable fingerprinting mechanism that retains essential policy information in a concise embedding.
Our empirical results demonstrate that combining these three elements can produce policies that outperform those that generated the training data.
arXiv Detail & Related papers (2020-02-26T23:00:27Z)
- A Comparative Study of Sequence Classification Models for Privacy Policy Coverage Analysis [0.0]
Privacy policies are legal documents that describe how a website will collect, use, and distribute a user's data.
Our solution is to provide users with a coverage analysis of a given website's privacy policy using a wide range of classical machine learning and deep learning techniques.
arXiv Detail & Related papers (2020-02-12T21:46:22Z)
- Reward-Conditioned Policies [100.64167842905069]
Imitation learning requires near-optimal expert data.
Can we learn effective policies via supervised learning without demonstrations?
We show how such an approach can be derived as a principled method for policy search.
arXiv Detail & Related papers (2019-12-31T18:07:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers or the information presented, and is not responsible for any consequences.