Automated Detection of GDPR Disclosure Requirements in Privacy Policies
using Deep Active Learning
- URL: http://arxiv.org/abs/2111.04224v1
- Date: Mon, 8 Nov 2021 01:28:27 GMT
- Authors: Tamjid Al Rahat, Tu Le, Yuan Tian
- Abstract summary: Most privacy policies are verbose, full of jargon, and vaguely describe companies' data practices and users' rights.
In this paper, we create a privacy policy dataset of 1,080 websites labeled with the 18 GDPR requirements.
We develop a Convolutional Neural Network (CNN) based model that classifies the privacy policies with an accuracy of 89.2%.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Since GDPR came into force in May 2018, companies have worked on their data
practices to comply with this privacy law. In particular, since the privacy
policy is the essential communication channel for users to understand and
control their privacy, many companies updated their privacy policies after GDPR
was enforced. However, most privacy policies are verbose, full of jargon, and
vaguely describe companies' data practices and users' rights. Therefore, it is
unclear if they comply with GDPR. In this paper, we create a privacy policy
dataset of 1,080 websites labeled with the 18 GDPR requirements and develop a
Convolutional Neural Network (CNN) based model which can classify the privacy
policies with an accuracy of 89.2%. We apply our model to perform a measurement
on the compliance in the privacy policies. Our results show that even after
GDPR went into effect, 97% of websites still fail to comply with at least one
requirement of GDPR.
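The classifier described above can be illustrated with a minimal sketch of a CNN-style multi-label text classifier: embed tokens, apply convolutional filters over word windows, max-pool over time, and emit one sigmoid score per GDPR requirement. The vocabulary, dimensions, and random weights below are illustrative assumptions, not the authors' actual trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy vocabulary and hyperparameters (not from the paper).
VOCAB = {"we": 0, "collect": 1, "your": 2, "data": 3, "share": 4, "<unk>": 5}
EMB_DIM, N_FILTERS, WINDOW, N_LABELS = 8, 4, 3, 18  # 18 GDPR requirements

emb = rng.normal(size=(len(VOCAB), EMB_DIM))        # word embeddings
conv_w = rng.normal(size=(N_FILTERS, WINDOW, EMB_DIM))  # conv filters
out_w = rng.normal(size=(N_FILTERS, N_LABELS))      # output layer

def predict(tokens):
    """Embed -> 1D convolution over word windows -> max-over-time
    pooling -> per-label sigmoid (multi-label output)."""
    ids = [VOCAB.get(t, VOCAB["<unk>"]) for t in tokens]
    x = emb[ids]                                    # (seq_len, EMB_DIM)
    spans = [x[i:i + WINDOW] for i in range(len(ids) - WINDOW + 1)]
    feats = np.array([[np.sum(w * s) for w in conv_w] for s in spans])
    pooled = feats.max(axis=0)                      # max over time
    logits = pooled @ out_w                         # (N_LABELS,)
    return 1.0 / (1.0 + np.exp(-logits))            # sigmoid per label

probs = predict("we collect your data and share it".split())
labels = probs > 0.5  # one binary decision per GDPR requirement
```

In practice each sentence or segment of a policy would be scored this way, and a requirement counts as disclosed if any segment exceeds the threshold; the untrained random weights here only demonstrate the shape of the computation.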
Related papers
- PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories.
We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds.
State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z) - A BERT-based Empirical Study of Privacy Policies' Compliance with GDPR [9.676166100354282]
This study addresses the challenge of analyzing GDPR compliance of privacy policies for 5G networks.
We manually collected privacy policies from almost 70 different mobile network operators (MNOs) and used an automated BERT-based model for classification.
In addition, we present the first empirical evidence on the readability of privacy policies for 5G networks; the approach we adopted incorporates various established readability metrics.
arXiv Detail & Related papers (2024-07-09T11:47:52Z) - Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users.
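The core idea behind user-level DP can be sketched in a few lines: bound each user's total contribution (rather than each record's), then add noise calibrated to that per-user sensitivity. This is a generic illustration of the principle, not the paper's fine-tuning algorithm; the clip bound and epsilon values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def user_level_dp_sum(user_contribs, clip=1.0, epsilon=1.0):
    """Release a noisy sum with user-level DP: clip each user's total
    contribution to [-clip, clip], so adding or removing one user changes
    the sum by at most `clip`, then add Laplace noise with scale
    clip / epsilon (the standard Laplace mechanism calibration)."""
    clipped = [float(np.clip(c, -clip, clip)) for c in user_contribs]
    total = sum(clipped)
    noise = rng.laplace(scale=clip / epsilon)
    return total + noise

# Each entry is one user's aggregate contribution; the second and third
# users exceed the clip bound and are truncated before noise is added.
noisy_total = user_level_dp_sum([0.5, 2.0, -3.0], clip=1.0, epsilon=1.0)
```

Record-level DP would instead clip and account for each individual record, which gives weaker guarantees for users who contribute many records; clipping at the user level is what makes the protection uniform across users.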
arXiv Detail & Related papers (2024-06-20T13:54:32Z) - OPPO: An Ontology for Describing Fine-Grained Data Practices in Privacy Policies of Online Social Networks [0.8287206589886879]
Data practices of Online Social Networks (OSNSs) must comply with privacy regulations such as the EU GDPR and CCPA.
This paper presents OPPO, an Ontology for Privacy Policies of OSNSs, which aims to fill gaps by formalizing detailed data practices from OSNSs' privacy policies.
arXiv Detail & Related papers (2023-09-27T19:42:05Z) - Is It a Trap? A Large-scale Empirical Study And Comprehensive Assessment
of Online Automated Privacy Policy Generators for Mobile Apps [15.181098379077344]
Automated Privacy Policy Generators can create privacy policies for mobile apps.
Only 20.1% of privacy policies could be generated by existing APPGs.
App developers must carefully select and use the appropriate APPGs to avoid potential pitfalls.
arXiv Detail & Related papers (2023-05-05T04:08:18Z) - How Do Input Attributes Impact the Privacy Loss in Differential Privacy? [55.492422758737575]
We study the connection between the per-subject norm in DP neural networks and individual privacy loss.
We introduce a novel metric termed the Privacy Loss-Input Susceptibility (PLIS) which allows one to apportion the subject's privacy loss to their input attributes.
arXiv Detail & Related papers (2022-11-18T11:39:03Z) - AI-enabled Automation for Completeness Checking of Privacy Policies [7.707284039078785]
In Europe, privacy policies are subject to compliance with the General Data Protection Regulation.
In this paper, we propose AI-based automation for completeness checking of privacy policies.
arXiv Detail & Related papers (2021-06-10T12:10:51Z) - Detecting Compliance of Privacy Policies with Data Protection Laws [0.0]
Privacy policies are often written in extensive legal jargon that is difficult to understand.
We aim to bridge that gap by providing a framework that analyzes privacy policies in light of various data protection laws.
By using such a tool, users would be better equipped to understand how their personal data is managed.
arXiv Detail & Related papers (2021-02-21T09:15:15Z) - Second layer data governance for permissioned blockchains: the privacy
management challenge [58.720142291102135]
In pandemic situations, such as the COVID-19 and Ebola outbreaks, sharing health data is crucial to avoid massive infection and decrease the number of deaths.
In this sense, permissioned blockchain technology emerges to empower users to get their rights providing data ownership, transparency, and security through an immutable, unified, and distributed database ruled by smart contracts.
arXiv Detail & Related papers (2020-10-22T13:19:38Z) - PCAL: A Privacy-preserving Intelligent Credit Risk Modeling Framework
Based on Adversarial Learning [111.19576084222345]
This paper proposes a framework of Privacy-preserving Credit risk modeling based on Adversarial Learning (PCAL).
PCAL aims to mask the private information inside the original dataset, while maintaining the important utility information for the target prediction task performance.
Results indicate that PCAL can learn an effective, privacy-free representation from user data, providing a solid foundation towards privacy-preserving machine learning for credit risk analysis.
arXiv Detail & Related papers (2020-10-06T07:04:59Z) - Private Reinforcement Learning with PAC and Regret Guarantees [69.4202374491817]
We design privacy-preserving exploration policies for episodic reinforcement learning (RL).
We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP).
We then develop a private optimism-based learning algorithm that simultaneously achieves strong PAC and regret bounds, and enjoys a JDP guarantee.
arXiv Detail & Related papers (2020-09-18T20:18:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.