Related papers: PrivComp-KG : Leveraging Knowledge Graph and Large Language Models for Privacy Policy Compliance Verification

PrivComp-KG : Leveraging Knowledge Graph and Large Language Models for Privacy Policy Compliance Verification

URL: http://arxiv.org/abs/2404.19744v1
Date: Tue, 30 Apr 2024 17:44:44 GMT
Title: PrivComp-KG : Leveraging Knowledge Graph and Large Language Models for Privacy Policy Compliance Verification
Authors: Leon Garza, Lavanya Elluri, Anantaa Kotal, Aritran Piplai, Deepti Gupta, Anupam Joshi,
Abstract summary: We propose a Large Language Model (LLM) and Semantic Web based approach for privacy compliance. PrivComp-KG is designed to efficiently store and retrieve comprehensive information concerning privacy policies. It can be queried to check for compliance with privacy policies by each vendor against relevant policy regulations.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Data protection and privacy is becoming increasingly crucial in the digital era. Numerous companies depend on third-party vendors and service providers to carry out critical functions within their operations, encompassing tasks such as data handling and storage. However, this reliance introduces potential vulnerabilities, as these vendors' security measures and practices may not always align with the standards expected by regulatory bodies. Businesses are required, often under the penalty of law, to ensure compliance with the evolving regulatory rules. Interpreting and implementing these regulations pose challenges due to their complexity. Regulatory documents are extensive, demanding significant effort for interpretation, while vendor-drafted privacy policies often lack the detail required for full legal compliance, leading to ambiguity. To ensure a concise interpretation of the regulatory requirements and compliance of organizational privacy policy with said regulations, we propose a Large Language Model (LLM) and Semantic Web based approach for privacy compliance. In this paper, we develop the novel Privacy Policy Compliance Verification Knowledge Graph, PrivComp-KG. It is designed to efficiently store and retrieve comprehensive information concerning privacy policies, regulatory frameworks, and domain-specific knowledge pertaining to the legal landscape of privacy. Using Retrieval Augmented Generation, we identify the relevant sections in a privacy policy with corresponding regulatory rules. This information about individual privacy policies is populated into the PrivComp-KG. Combining this with the domain context and rules, the PrivComp-KG can be queried to check for compliance with privacy policies by each vendor against relevant policy regulations. We demonstrate the relevance of the PrivComp-KG, by verifying compliance of privacy policy documents for various organizations.

Related papers

Differential Privacy Overview and Fundamental Techniques [63.0409690498569]
This chapter is meant to be part of the book "Differential Privacy in Artificial Intelligence: From Theory to Practice" It starts by illustrating various attempts to protect data privacy, emphasizing where and why they failed. It then defines the key actors, tasks, and scopes that make up the domain of privacy-preserving data analysis.
arXiv Detail & Related papers (2024-11-07T13:52:11Z)
C3PA: An Open Dataset of Expert-Annotated and Regulation-Aware Privacy Policies to Enable Scalable Regulatory Compliance Audits [7.1195014414194695]
C3PA is the first regulation-aware dataset of expert-annotated privacy policies. It contains over 48K expert-labeled privacy policy text segments associated with responses to CCPA-specific disclosure mandates.
arXiv Detail & Related papers (2024-10-04T21:04:39Z)
RIRAG: Regulatory Information Retrieval and Answer Generation [51.998738311700095]
We introduce a task of generating question-passages pairs, where questions are automatically created and paired with relevant regulatory passages. We create the ObliQA dataset, containing 27,869 questions derived from the collection of Abu Dhabi Global Markets (ADGM) financial regulation documents. We design a baseline Regulatory Information Retrieval and Answer Generation (RIRAG) system and evaluate it with RePASs, a novel evaluation metric.
arXiv Detail & Related papers (2024-09-09T14:44:19Z)
How Privacy-Savvy Are Large Language Models? A Case Study on Compliance and Privacy Technical Review [15.15468770348023]
We evaluate large language models' performance in privacy-related tasks such as privacy information extraction (PIE), legal and regulatory key point detection (KPD), and question answering (QA) Through an empirical assessment, we investigate the capacity of several prominent LLMs, including BERT, GPT-3.5, GPT-4, and custom models, in executing privacy compliance checks and technical privacy reviews. While LLMs show promise in automating privacy reviews and identifying regulatory discrepancies, significant gaps persist in their ability to fully comply with evolving legal standards.
arXiv Detail & Related papers (2024-09-04T01:51:37Z)
PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories. We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds. State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z)
GoldCoin: Grounding Large Language Models in Privacy Laws via Contextual Integrity Theory [44.297102658873726]
Existing research studies privacy by exploring various privacy attacks, defenses, and evaluations within narrowly predefined patterns. We introduce a novel framework, GoldCoin, designed to efficiently ground LLMs in privacy laws for judicial assessing privacy violations. Our framework leverages the theory of contextual integrity as a bridge, creating numerous synthetic scenarios grounded in relevant privacy statutes.
arXiv Detail & Related papers (2024-06-17T02:27:32Z)
Assessing Mobile Application Privacy: A Quantitative Framework for Privacy Measurement [0.0]
This work aims to contribute to a digital environment that prioritizes privacy, promotes informed decision-making, and endorses the privacy-preserving design principles. The purpose of this framework is to systematically evaluate the level of privacy risk when using particular Android applications.
arXiv Detail & Related papers (2023-10-31T18:12:19Z)
Advancing Differential Privacy: Where We Are Now and Future Directions for Real-World Deployment [100.1798289103163]
We present a detailed review of current practices and state-of-the-art methodologies in the field of differential privacy (DP) Key points and high-level contents of the article were originated from the discussions from "Differential Privacy (DP): Challenges Towards the Next Frontier" This article aims to provide a reference point for the algorithmic and design decisions within the realm of privacy, highlighting important challenges and potential research directions.
arXiv Detail & Related papers (2023-04-14T05:29:18Z)
PLUE: Language Understanding Evaluation Benchmark for Privacy Policies in English [77.79102359580702]
We introduce the Privacy Policy Language Understanding Evaluation benchmark, a multi-task benchmark for evaluating the privacy policy language understanding. We also collect a large corpus of privacy policies to enable privacy policy domain-specific language model pre-training. We demonstrate that domain-specific continual pre-training offers performance improvements across all tasks.
arXiv Detail & Related papers (2022-12-20T05:58:32Z)
Exploring Consequences of Privacy Policies with Narrative Generation via Answer Set Programming [0.0]
We present a framework that uses Answer Set Programming (ASP) to formalize privacy policies. ASP allows end-users to forward-simulate possible consequences of the policy in terms of actors. We demonstrate through the example of the Health Insurance Portability and Accountability Act how to use the system in various ways.
arXiv Detail & Related papers (2022-12-13T16:44:46Z)
Distributed Machine Learning and the Semblance of Trust [66.1227776348216]
Federated Learning (FL) allows the data owner to maintain data governance and perform model training locally without having to share their data. FL and related techniques are often described as privacy-preserving. We explain why this term is not appropriate and outline the risks associated with over-reliance on protocols that were not designed with formal definitions of privacy in mind.
arXiv Detail & Related papers (2021-12-21T08:44:05Z)
Detecting Compliance of Privacy Policies with Data Protection Laws [0.0]
Privacy policies are often written in extensive legal jargon that is difficult to understand. We aim to bridge that gap by providing a framework that analyzes privacy policies in light of various data protection laws. By using such a tool, users would be better equipped to understand how their personal data is managed.
arXiv Detail & Related papers (2021-02-21T09:15:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.