Smart Privacy Policy Assistant: An LLM-Powered System for Transparent and Actionable Privacy Notices
- URL: http://arxiv.org/abs/2601.06357v1
- Date: Fri, 09 Jan 2026 23:42:59 GMT
- Title: Smart Privacy Policy Assistant: An LLM-Powered System for Transparent and Actionable Privacy Notices
- Authors: Sriharshini Kalvakuntla, Luoxi Tang, Yuqiao Meng, Zhaohan Xi
- Abstract summary: Most users agree to online privacy policies without reading or understanding them, even though these documents govern how personal data is collected, shared, and monetized. This paper presents the Smart Privacy Policy Assistant, an LLM-powered system that automatically ingests privacy policies, extracts and categorizes key clauses, assigns human-interpretable risk levels, and generates clear, concise explanations. We describe the end-to-end pipeline, including policy ingestion, clause categorization, risk scoring, and explanation generation, and propose an evaluation framework based on clause-level accuracy, policy-level risk agreement, and user comprehension.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most users agree to online privacy policies without reading or understanding them, even though these documents govern how personal data is collected, shared, and monetized. Privacy policies are typically long, legally complex, and difficult for non-experts to interpret. This paper presents the Smart Privacy Policy Assistant, an LLM-powered system that automatically ingests privacy policies, extracts and categorizes key clauses, assigns human-interpretable risk levels, and generates clear, concise explanations. The system is designed for real-time use through browser extensions or mobile interfaces, surfacing contextual warnings before users disclose sensitive information or grant risky permissions. We describe the end-to-end pipeline, including policy ingestion, clause categorization, risk scoring, and explanation generation, and propose an evaluation framework based on clause-level accuracy, policy-level risk agreement, and user comprehension.
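The described pipeline (ingestion, clause categorization, risk scoring, explanation generation) can be sketched in miniature. In the sketch below, keyword heuristics stand in for the paper's LLM components, and all category names, keywords, and risk levels are illustrative assumptions, not the authors' actual taxonomy.

```python
import re

# Hypothetical clause taxonomy and risk mapping (illustrative only).
CATEGORIES = {
    "data_sharing": ["share", "third part", "partner"],
    "data_collection": ["collect", "gather"],
    "monetization": ["sell", "advertis", "monetiz"],
}
RISK = {"data_sharing": "medium", "data_collection": "low", "monetization": "high"}

def categorize(clause: str) -> str:
    """Assign a clause to the first category whose keyword it contains."""
    text = clause.lower()
    for category, keywords in CATEGORIES.items():
        if any(k in text for k in keywords):
            return category
    return "other"

def analyze_policy(policy: str) -> list[dict]:
    """Split a policy into clauses; attach a category, risk level, and explanation."""
    clauses = [c.strip() for c in re.split(r"(?<=[.;])\s+", policy) if c.strip()]
    report = []
    for clause in clauses:
        cat = categorize(clause)
        risk = RISK.get(cat, "low")
        report.append({
            "clause": clause,
            "category": cat,
            "risk": risk,
            "explanation": f"This clause concerns {cat.replace('_', ' ')} (risk: {risk}).",
        })
    return report
```

A real deployment would replace `categorize` and the explanation template with LLM calls, but the report structure (clause, category, risk, explanation) mirrors the outputs the abstract describes.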
Related papers
- DAVE: A Policy-Enforcing LLM Spokesperson for Secure Multi-Document Data Sharing
DAVE is a usage policy-enforcing spokesperson that answers questions over private documents on behalf of a data provider. We formalize policy-violating information disclosure in this setting, drawing on usage control and information flow security. Our contribution is primarily architectural: we do not yet implement or empirically evaluate the full enforcement pipeline.
arXiv Detail & Related papers (2026-02-19T14:43:48Z)
- SoK: Privacy Risks and Mitigations in Retrieval-Augmented Generation Systems
Retrieval-Augmented Generation (RAG) techniques have become widely popular. RAG involves the coupling of Large Language Models (LLMs) with domain-specific knowledge bases. The proliferation of RAG has sparked concerns about data privacy.
arXiv Detail & Related papers (2026-01-07T14:50:41Z)
- MAGPIE: A dataset for Multi-AGent contextual PrIvacy Evaluation
Existing benchmarks to evaluate contextual privacy in LLM-agents primarily assess single-turn, low-complexity tasks. We first present MAGPIE, a benchmark comprising 158 real-life high-stakes scenarios across 15 domains. We then evaluate the current state-of-the-art LLMs on their understanding of contextually private data and their ability to collaborate without violating user privacy.
arXiv Detail & Related papers (2025-06-25T18:04:25Z)
- Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents
We characterize the notion of contextual privacy for user interactions with LLM-based Conversational Agents (LCAs). It aims to minimize privacy risks by ensuring that users (senders) disclose only information that is both relevant and necessary for achieving their intended goals. We propose a locally deployable framework that operates between users and LCAs, identifying and reformulating out-of-context information in user prompts.
arXiv Detail & Related papers (2025-02-22T09:05:39Z)
- PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories. We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds. State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z)
- A Human-in-the-Loop Approach for Information Extraction from Privacy Policies under Data Scarcity
We present a prototype system for a 'Human-in-the-Loop' approach to privacy policy annotation.
We propose an ML-based suggestion system specifically tailored to the constraint of data scarcity prevalent in the domain of privacy policy annotation.
arXiv Detail & Related papers (2023-05-24T10:45:26Z)
- PLUE: Language Understanding Evaluation Benchmark for Privacy Policies in English
We introduce the Privacy Policy Language Understanding Evaluation (PLUE) benchmark, a multi-task benchmark for evaluating privacy policy language understanding.
We also collect a large corpus of privacy policies to enable privacy policy domain-specific language model pre-training.
We demonstrate that domain-specific continual pre-training offers performance improvements across all tasks.
arXiv Detail & Related papers (2022-12-20T05:58:32Z)
- Exploring Consequences of Privacy Policies with Narrative Generation via Answer Set Programming
We present a framework that uses Answer Set Programming (ASP) to formalize privacy policies.
ASP allows end-users to forward-simulate possible consequences of the policy in terms of actors.
We demonstrate through the example of the Health Insurance Portability and Accountability Act how to use the system in various ways.
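The forward-simulation idea can be illustrated without an ASP solver. The sketch below swaps Answer Set Programming for naive forward chaining in plain Python; the HIPAA-flavored actors, facts, and rules are hypothetical examples, not the paper's encoding.

```python
# Stand-in for ASP-based forward simulation: derive all consequences of a
# policy by applying rules (premises -> conclusion) until a fixed point.
def forward_simulate(facts: set, rules: list) -> set:
    """Return the closure of `facts` under `rules`."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

# Hypothetical HIPAA-style scenario: disclosure requires patient consent.
facts = {("holds", "hospital", "record"), ("consents", "patient", "insurer")}
rules = [
    ({("holds", "hospital", "record"), ("consents", "patient", "insurer")},
     ("may_disclose", "hospital", "insurer")),
    # Once disclosure is permitted and happens, the insurer holds the record.
    ({("may_disclose", "hospital", "insurer")},
     ("holds", "insurer", "record")),
]
```

An end-user can then inspect the derived set to see downstream consequences (e.g. which actors end up holding the record), which is the forward-simulation capability the summary describes; a real ASP encoding would additionally handle defaults and negation.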
arXiv Detail & Related papers (2022-12-13T16:44:46Z)
- How Do Input Attributes Impact the Privacy Loss in Differential Privacy?
We study the connection between the per-subject norm in DP neural networks and individual privacy loss.
We introduce a novel metric termed the Privacy Loss-Input Susceptibility (PLIS) which allows one to apportion the subject's privacy loss to their input attributes.
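As a rough illustration of apportioning a per-subject quantity to input attributes (not the paper's exact PLIS definition), one can estimate each attribute's share by finite-difference sensitivity of a per-subject gradient-norm function. The norm function and attribute values below are toy assumptions.

```python
# Illustrative attribution sketch: how much does each input attribute move
# the DP-relevant per-subject gradient norm? Shares are normalized to sum to 1.
def attribute_susceptibility(grad_norm, x, eps=1e-4):
    """Estimate per-attribute shares of sensitivity of `grad_norm` at point x."""
    base = grad_norm(x)
    deltas = []
    for i in range(len(x)):
        bumped = list(x)
        bumped[i] += eps  # perturb one attribute at a time
        deltas.append(abs(grad_norm(bumped) - base) / eps)
    total = sum(deltas) or 1.0
    return [d / total for d in deltas]
```

With a toy norm such as `lambda x: (x[0]**2 + 0.1 * x[1]**2) ** 0.5` evaluated at `[1.0, 0.0]`, nearly all of the sensitivity is apportioned to the first attribute, which is the kind of per-attribute accounting the PLIS metric formalizes.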
arXiv Detail & Related papers (2022-11-18T11:39:03Z)
- Privacy Policy Question Answering Assistant: A Query-Guided Extractive Summarization Approach
We propose an automated privacy policy question answering assistant that extracts a summary in response to the input user query.
This is a challenging task because users articulate their privacy-related questions in a very different language than the legal language of the policy.
Our pipeline finds an answer for 89% of the user queries in the PrivacyQA dataset.
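The query-guided extractive idea can be sketched with simple lexical overlap. Note that the paper's system specifically addresses the vocabulary gap between user language and legal language, which this toy ranker does not; the function name and example sentences are illustrative.

```python
# Toy query-guided extraction: rank policy sentences by word overlap with the
# user's question and return the top-k as an extractive answer summary.
def extract_answer(query: str, policy_sentences: list[str], top_k: int = 2) -> list[str]:
    """Return the top-k policy sentences by lexical overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(
        policy_sentences,
        key=lambda s: len(q_words & set(s.lower().split())),
        reverse=True,
    )
    return scored[:top_k]
```

A realistic pipeline would replace the overlap score with a learned relevance model (e.g. embedding similarity) precisely because users rarely reuse the policy's legal wording.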
arXiv Detail & Related papers (2021-09-29T18:00:09Z)
- PolicyQA: A Reading Comprehension Dataset for Privacy Policies
We present PolicyQA, a dataset that contains 25,017 reading comprehension style examples curated from an existing corpus of 115 website privacy policies.
We evaluate two existing neural QA models and perform rigorous analysis to reveal the advantages and challenges offered by PolicyQA.
arXiv Detail & Related papers (2020-10-06T09:04:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.