Creation and Analysis of an International Corpus of Privacy Laws
- URL: http://arxiv.org/abs/2206.14169v1
- Date: Tue, 28 Jun 2022 17:36:12 GMT
- Title: Creation and Analysis of an International Corpus of Privacy Laws
- Authors: Sonu Gupta, Ellen Poplavska, Nora O'Toole, Siddhant Arora, Thomas Norton, Norman Sadeh, Shomir Wilson
- Abstract summary: We introduce the Government Privacy Instructions Corpus, or GPI Corpus, of 1,043 privacy laws, regulations, and guidelines, covering 182 jurisdictions.
We examine the temporal distribution of when GPIs were created and illustrate the dramatic increase in privacy legislation over the past 50 years.
Our exploration also demonstrates that most privacy laws respectively address relatively few personal data types, showing that comprehensive privacy legislation remains rare.
- Score: 7.45571096955396
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The landscape of privacy laws and regulations around the world is complex and
ever-changing. National and super-national laws, agreements, decrees, and other
government-issued rules form a patchwork that companies must follow to operate
internationally. To examine the status and evolution of this patchwork, we
introduce the Government Privacy Instructions Corpus, or GPI Corpus, of 1,043
privacy laws, regulations, and guidelines, covering 182 jurisdictions. This
corpus enables a large-scale quantitative and qualitative examination of legal
foci on privacy. We examine the temporal distribution of when GPIs were created
and illustrate the dramatic increase in privacy legislation over the past 50
years, although a finer-grained examination reveals that the rate of increase
varies depending on the personal data types that GPIs address. Our exploration
also demonstrates that most privacy laws respectively address relatively few
personal data types, showing that comprehensive privacy legislation remains
rare. Additionally, topic modeling results show the prevalence of common themes
in GPIs, such as finance, healthcare, and telecommunications. Finally, we
release the corpus to the research community to promote further study.
Related papers
- How Privacy-Savvy Are Large Language Models? A Case Study on Compliance and Privacy Technical Review [15.15468770348023]
We evaluate large language models' performance in privacy-related tasks such as privacy information extraction (PIE), legal and regulatory key point detection (KPD), and question answering (QA).
Through an empirical assessment, we investigate the capacity of several prominent LLMs, including BERT, GPT-3.5, GPT-4, and custom models, in executing privacy compliance checks and technical privacy reviews.
While LLMs show promise in automating privacy reviews and identifying regulatory discrepancies, significant gaps persist in their ability to fully comply with evolving legal standards.
arXiv Detail & Related papers (2024-09-04T01:51:37Z)
- PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories.
We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds.
State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z)
- Privacy Checklist: Privacy Violation Detection Grounding on Contextual Integrity Theory [43.12744258781724]
We formulate the privacy issue as a reasoning problem rather than simple pattern matching.
We develop the first comprehensive checklist that covers social identities, private attributes, and existing privacy regulations.
arXiv Detail & Related papers (2024-08-19T14:48:04Z)
- GoldCoin: Grounding Large Language Models in Privacy Laws via Contextual Integrity Theory [44.297102658873726]
Existing research studies privacy by exploring various privacy attacks, defenses, and evaluations within narrowly predefined patterns.
We introduce a novel framework, GoldCoin, designed to efficiently ground LLMs in privacy laws for judicially assessing privacy violations.
Our framework leverages the theory of contextual integrity as a bridge, creating numerous synthetic scenarios grounded in relevant privacy statutes.
arXiv Detail & Related papers (2024-06-17T02:27:32Z)
- PrivComp-KG: Leveraging Knowledge Graph and Large Language Models for Privacy Policy Compliance Verification [0.0]
We propose a Large Language Model (LLM) and Semantic Web based approach for privacy compliance.
PrivComp-KG is designed to efficiently store and retrieve comprehensive information concerning privacy policies.
It can be queried to check for compliance with privacy policies by each vendor against relevant policy regulations.
arXiv Detail & Related papers (2024-04-30T17:44:44Z)
- SoK: The Gap Between Data Rights Ideals and Reality [46.14715472341707]
Do rights-based privacy laws effectively empower individuals over their data?
This paper scrutinizes these approaches by reviewing empirical studies, news articles, and blog posts.
arXiv Detail & Related papers (2023-12-03T21:52:51Z)
- PLUE: Language Understanding Evaluation Benchmark for Privacy Policies in English [77.79102359580702]
We introduce the Privacy Policy Language Understanding Evaluation benchmark, a multi-task benchmark for evaluating privacy policy language understanding.
We also collect a large corpus of privacy policies to enable privacy policy domain-specific language model pre-training.
We demonstrate that domain-specific continual pre-training offers performance improvements across all tasks.
arXiv Detail & Related papers (2022-12-20T05:58:32Z)
- Algorithms with More Granular Differential Privacy Guarantees [65.3684804101664]
We consider partial differential privacy (DP), which allows quantifying the privacy guarantee on a per-attribute basis.
In this work, we study several basic data analysis and learning tasks, and design algorithms whose per-attribute privacy parameter is smaller than the best possible privacy parameter for the entire record of a person.
arXiv Detail & Related papers (2022-09-08T22:43:50Z)
- Ctrl-Shift: How Privacy Sentiment Changed from 2019 to 2021 [14.600192799641077]
We study the sentiments of people in the U.S. toward collection and use of data for government- and health-related purposes from 2019 to 2021.
After the onset of COVID-19, we observe significant decreases in respondent acceptance of government data use.
Following the 2020 U.S. national elections, we observe some of the first evidence that privacy sentiments may change based on the alignment between a user's politics and the political party in power.
arXiv Detail & Related papers (2021-10-18T16:13:02Z)