Related papers: Privacy Policy Question Answering Assistant: A Query-Guided Extractive Summarization Approach

URL: http://arxiv.org/abs/2109.14638v1
Date: Wed, 29 Sep 2021 18:00:09 GMT
Title: Privacy Policy Question Answering Assistant: A Query-Guided Extractive Summarization Approach
Authors: Moniba Keymanesh, Micha Elsner, Srinivasan Parthasarathy
Abstract summary: We propose an automated privacy policy question answering assistant that extracts a summary in response to the input user query. This is a challenging task because users articulate their privacy-related questions in a very different language than the legal language of the policy. Our pipeline is able to find an answer for 89% of the user queries in the privacyQA dataset.
Score: 18.51811191325837
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Existing work on making privacy policies accessible has explored new presentation forms such as color-coding based on the risk factors or summarization to assist users with conscious agreement. To facilitate a more personalized interaction with the policies, in this work, we propose an automated privacy policy question answering assistant that extracts a summary in response to the input user query. This is a challenging task because users articulate their privacy-related questions in a very different language than the legal language of the policy, making it difficult for the system to understand their inquiry. Moreover, existing annotated data in this domain are limited. We address these problems by paraphrasing to bring the style and language of the user's question closer to the language of privacy policies. Our content scoring module uses the existing in-domain data to find relevant information in the policy and incorporates it in a summary. Our pipeline is able to find an answer for 89% of the user queries in the privacyQA dataset.
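The query-guided extraction idea in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it swaps their paraphrasing step and trained content scoring module for plain TF-IDF cosine similarity between the query and each policy sentence. All function names (`tokenize`, `split_sentences`, `tfidf_vectors`, `cosine`, `extract_summary`) and the sample policy text are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tfidf_vectors(token_lists):
    """Build sparse TF-IDF vectors (dicts) using a smoothed IDF."""
    n = len(token_lists)
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [
        {t: tf * idf[t] for t, tf in Counter(tokens).items()}
        for tokens in token_lists
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def extract_summary(policy_text, query, k=2):
    """Return up to k policy sentences most similar to the query, in document order.

    Assumption: similarity is plain TF-IDF cosine, standing in for the
    paper's learned content scoring; no query paraphrasing is performed.
    """
    sentences = split_sentences(policy_text)
    vectors = tfidf_vectors([tokenize(s) for s in sentences] + [tokenize(query)])
    query_vec = vectors[-1]
    scored = [(cosine(v, query_vec), i) for i, v in enumerate(vectors[:-1])]
    top = sorted(scored, key=lambda p: (-p[0], p[1]))[:k]
    keep = sorted(i for score, i in top if score > 0.0)
    return [sentences[i] for i in keep]


# Hypothetical policy text, used only to exercise the sketch.
POLICY = (
    "We collect your email address when you register. "
    "We use cookies to remember your preferences. "
    "We do not sell your email address to third parties."
)

if __name__ == "__main__":
    for sentence in extract_summary(POLICY, "Do you sell my email address?", k=1):
        print(sentence)
```

A purely lexical scorer like this only works when the query and the policy share vocabulary, which is the gap the abstract says paraphrasing is meant to narrow.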

Related papers

Controlling What You Share: Assessing Language Model Adherence to Privacy Preferences [80.63946798650653]
We explore how users can stay in control of their data by using privacy profiles. We build a framework where a local model uses these instructions to rewrite queries. To support this research, we introduce a multilingual dataset of real user queries to mark private content.
arXiv Detail & Related papers (2025-07-07T18:22:55Z)
PRIV-QA: Privacy-Preserving Question Answering for Cloud Large Language Models [10.050972891318324]
We propose a privacy preservation pipeline for protecting privacy and sensitive information during interactions between users and large language models. We construct SensitiveQA, the first privacy open-ended question-answering dataset. Our proposed solution employs a multi-stage strategy aimed at preemptively securing user information while simultaneously preserving the response quality of cloud-based LLMs.
arXiv Detail & Related papers (2025-02-19T09:17:07Z)
Few-shot Policy (de)composition in Conversational Question Answering [54.259440408606515]
We propose a neuro-symbolic framework to detect policy compliance using large language models (LLMs) in a few-shot setting. We show that our approach soundly reasons about policy compliance conversations by extracting sub-questions to be answered, assigning truth values from contextual information, and explicitly producing a set of logic statements from the given policies. We apply this approach to the popular PCD and conversational machine reading benchmark, ShARC, and show competitive performance with no task-specific finetuning.
arXiv Detail & Related papers (2025-01-20T08:40:15Z)
Open Domain Question Answering with Conflicting Contexts [55.739842087655774]
We find that as much as 25% of unambiguous, open domain questions can lead to conflicting contexts when retrieved using Google Search. We ask our annotators to provide explanations for their selections of correct answers.
arXiv Detail & Related papers (2024-10-16T07:24:28Z)
PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories. We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds. State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z)
Operationalizing Contextual Integrity in Privacy-Conscious Assistants [34.70330533067581]
We propose to operationalize contextual integrity (CI) to steer advanced AI assistants to behave in accordance with privacy expectations. In particular, we design and evaluate a number of strategies to steer assistants' information-sharing actions to be CI compliant. Our evaluation is based on a novel form filling benchmark composed of human annotations of common webform applications.
arXiv Detail & Related papers (2024-08-05T10:53:51Z)
PAQA: Toward ProActive Open-Retrieval Question Answering [34.883834970415734]
This work aims to tackle the challenge of generating relevant clarifying questions by taking into account the inherent ambiguities present in both user queries and documents. We propose PAQA, an extension to the existing AmbiNQ dataset, incorporating clarifying questions. We then evaluate various models and assess how passage retrieval impacts ambiguity detection and the generation of clarifying questions.
arXiv Detail & Related papers (2024-02-26T14:40:34Z)
Experts-in-the-Loop: Establishing an Effective Workflow in Crafting Privacy Q&A [0.0]
We propose a dynamic workflow for transforming privacy policies into privacy question-and-answer (Q&A) pairs. Thereby, we facilitate interdisciplinary collaboration among legal experts and conversation designers. Our proposed workflow underscores continuous improvement and monitoring throughout the construction of privacy Q&As.
arXiv Detail & Related papers (2023-11-18T20:32:59Z)
Social Commonsense-Guided Search Query Generation for Open-Domain Knowledge-Powered Conversations [66.16863141262506]
We present a novel approach that focuses on generating internet search queries guided by social commonsense. Our proposed framework addresses passive user interactions by integrating topic tracking, commonsense response generation and instruction-driven query generation.
arXiv Detail & Related papers (2023-10-22T16:14:56Z)
PLUE: Language Understanding Evaluation Benchmark for Privacy Policies in English [77.79102359580702]
We introduce the Privacy Policy Language Understanding Evaluation benchmark, a multi-task benchmark for evaluating privacy policy language understanding. We also collect a large corpus of privacy policies to enable privacy policy domain-specific language model pre-training. We demonstrate that domain-specific continual pre-training offers performance improvements across all tasks.
arXiv Detail & Related papers (2022-12-20T05:58:32Z)
Exploring Consequences of Privacy Policies with Narrative Generation via Answer Set Programming [0.0]
We present a framework that uses Answer Set Programming (ASP) to formalize privacy policies. ASP allows end-users to forward-simulate possible consequences of the policy in terms of actors. We demonstrate through the example of the Health Insurance Portability and Accountability Act how to use the system in various ways.
arXiv Detail & Related papers (2022-12-13T16:44:46Z)
PolicyQA: A Reading Comprehension Dataset for Privacy Policies [77.79102359580702]
We present PolicyQA, a dataset that contains 25,017 reading comprehension style examples curated from an existing corpus of 115 website privacy policies. We evaluate two existing neural QA models and perform rigorous analysis to reveal the advantages and challenges offered by PolicyQA.
arXiv Detail & Related papers (2020-10-06T09:04:58Z)
Inquisitive Question Generation for High Level Text Comprehension [60.21497846332531]
We introduce INQUISITIVE, a dataset of 19K questions that are elicited while a person is reading through a document. We show that readers engage in a series of pragmatic strategies to seek information. We evaluate question generation models based on GPT-2 and show that our model is able to generate reasonable questions.
arXiv Detail & Related papers (2020-10-04T19:03:39Z)

This list is automatically generated from the titles and abstracts of the papers in this site.