Cross-Policy Compliance Detection via Question Answering
- URL: http://arxiv.org/abs/2109.03731v1
- Date: Wed, 8 Sep 2021 15:47:41 GMT
- Title: Cross-Policy Compliance Detection via Question Answering
- Authors: Marzieh Saeidi, Majid Yazdani, Andreas Vlachos
- Abstract summary: We propose to address policy compliance detection via decomposing it into question answering.
We demonstrate that this approach results in better accuracy, especially in the cross-policy setup.
It explicitly identifies the information missing from a scenario in case policy compliance cannot be determined.
- Score: 13.373804837863155
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Policy compliance detection is the task of ensuring that a scenario conforms
to a policy (e.g. a claim is valid according to government rules or a post in
an online platform conforms to community guidelines). This task has been
previously instantiated as a form of textual entailment, which results in poor
accuracy due to the complexity of the policies. In this paper we propose to
address policy compliance detection via decomposing it into question answering,
where questions check whether the conditions stated in the policy apply to the
scenario, and an expression tree combines the answers to obtain the label.
Despite the initial upfront annotation cost, we demonstrate that this approach
results in better accuracy, especially in the cross-policy setup where the
policies during testing are unseen in training. In addition, it allows us to
use existing question answering models pre-trained on existing large datasets.
Finally, it explicitly identifies the information missing from a scenario in
case policy compliance cannot be determined. We conduct our experiments using a
recent dataset consisting of government policies, which we augment with expert
annotations, and find that the cost of annotating question answering
decomposition is largely offset by improved inter-annotator agreement and
speed.
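The decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the node classes, the `evaluate` and `missing_questions` helpers, the label strings, and the example policy are all hypothetical, and the use of three-valued (Kleene) logic for unanswered questions is an assumption. The answers are supplied by hand here; in the paper's setup they would come from a question answering model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple, Union

# An answer is True (yes), False (no), or None (the scenario does not say).
Answer = Optional[bool]


@dataclass(frozen=True)
class Question:
    text: str


@dataclass(frozen=True)
class Not:
    child: "Node"


@dataclass(frozen=True)
class And:
    children: Tuple["Node", ...]


@dataclass(frozen=True)
class Or:
    children: Tuple["Node", ...]


Node = Union[Question, Not, And, Or]


def evaluate(node: Node, answers: Dict[str, Answer]) -> Answer:
    """Combine question answers with three-valued logic (assumed, not from the paper)."""
    if isinstance(node, Question):
        return answers.get(node.text)
    if isinstance(node, Not):
        value = evaluate(node.child, answers)
        return None if value is None else not value
    if isinstance(node, And):
        values = [evaluate(child, answers) for child in node.children]
        if any(v is False for v in values):
            return False  # one failed condition settles a conjunction
        return None if any(v is None for v in values) else True
    if isinstance(node, Or):
        values = [evaluate(child, answers) for child in node.children]
        if any(v is True for v in values):
            return True  # one satisfied condition settles a disjunction
        return None if any(v is None for v in values) else False
    raise TypeError(f"unsupported node: {node!r}")


def missing_questions(node: Node, answers: Dict[str, Answer]) -> List[str]:
    """List unanswered questions inside subtrees whose value is still undetermined."""
    if evaluate(node, answers) is not None:
        return []
    if isinstance(node, Question):
        return [node.text]
    children = (node.child,) if isinstance(node, Not) else node.children
    found: List[str] = []
    for child in children:
        for text in missing_questions(child, answers):
            if text not in found:
                found.append(text)
    return found


def label(node: Node, answers: Dict[str, Answer]) -> str:
    """Map the tree value to a hypothetical label string."""
    value = evaluate(node, answers)
    if value is None:
        return "not enough information"
    return "compliant" if value else "not compliant"


# Hypothetical policy: resident AND (over 65 OR disabled).
POLICY = And((
    Question("Is the claimant a resident?"),
    Or((
        Question("Is the claimant over 65?"),
        Question("Is the claimant disabled?"),
    )),
))

scenario = {"Is the claimant a resident?": True, "Is the claimant over 65?": False}
print(label(POLICY, scenario))
print(missing_questions(POLICY, scenario))
```

In this sketch, a scenario that leaves a decisive question unanswered yields an undetermined value, and `missing_questions` reports only the questions that could still change the outcome, which mirrors the abstract's point about identifying missing information.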
Related papers
- Few-shot Policy (de)composition in Conversational Question Answering [54.259440408606515]
We propose a neuro-symbolic framework to detect policy compliance using large language models (LLMs) in a few-shot setting.
We show that our approach soundly reasons about policy compliance conversations by extracting sub-questions to be answered, assigning truth values from contextual information, and explicitly producing a set of logic statements from the given policies.
We apply this approach to the popular PCD and conversational machine reading benchmark, ShARC, and show competitive performance with no task-specific finetuning.
arXiv Detail & Related papers (2025-01-20T08:40:15Z)
- Picachv: Formally Verified Data Use Policy Enforcement for Secure Data Analytics [10.630556229470681]
We introduce Picachv, a novel security monitor that automatically enforces data use policies.
It works on relational algebra as an abstraction for program semantics, enabling policy enforcement on query plans generated by programs during execution.
We integrate Picachv into Polars, a state-of-the-art data analytics framework, and evaluate its performance using the TPC-H benchmark.
arXiv Detail & Related papers (2025-01-17T21:30:55Z)
- Statistical Analysis of Policy Space Compression Problem [54.1754937830779]
Policy search methods are crucial in reinforcement learning, offering a framework to address continuous state-action and partially observable problems.
Reducing the policy space through policy compression emerges as a powerful, reward-free approach to accelerate the learning process.
This technique condenses the policy space into a smaller, representative set while maintaining most of the original effectiveness.
arXiv Detail & Related papers (2024-11-15T02:46:55Z)
- Information Capacity Regret Bounds for Bandits with Mediator Feedback [55.269551124587224]
We introduce the policy set capacity as an information-theoretic measure for the complexity of the policy set.
Adopting the classical EXP4 algorithm, we provide new regret bounds depending on the policy set capacity.
For a selection of policy set families, we prove nearly-matching lower bounds, scaling similarly with the capacity.
arXiv Detail & Related papers (2024-02-15T19:18:47Z)
- Off-Policy Evaluation for Large Action Spaces via Policy Convolution [60.6953713877886]
Policy Convolution family of estimators uses latent structure within actions to strategically convolve the logging and target policies.
Experiments on synthetic and benchmark datasets demonstrate remarkable mean squared error (MSE) improvements when using PC.
arXiv Detail & Related papers (2023-10-24T01:00:01Z)
- Conformal Off-Policy Evaluation in Markov Decision Processes [53.786439742572995]
Reinforcement Learning aims at identifying and evaluating efficient control policies from data.
Most methods for this learning task, referred to as Off-Policy Evaluation (OPE), do not come with accuracy and certainty guarantees.
We present a novel OPE method based on Conformal Prediction that outputs an interval containing the true reward of the target policy with a prescribed level of certainty.
arXiv Detail & Related papers (2023-04-05T16:45:11Z)
- Distributionally Robust Batch Contextual Bandits [20.667213458836734]
Policy learning using historical observational data is an important problem that has found widespread applications.
Existing literature rests on the crucial assumption that the future environment where the learned policy will be deployed is the same as the past environment.
In this paper, we lift this assumption and aim to learn a distributionally robust policy with incomplete observational data.
arXiv Detail & Related papers (2020-06-10T03:11:40Z)
- Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies [80.42316902296832]
We study the estimation of policy value and gradient of a deterministic policy from off-policy data when actions are continuous.
In this setting, standard importance sampling and doubly robust estimators for policy value and gradient fail because the density ratio does not exist.
We propose several new doubly robust estimators based on different kernelization approaches.
arXiv Detail & Related papers (2020-06-06T15:52:05Z)
- Fast Compliance Checking with General Vocabularies [0.0]
We introduce an OWL2 profile for representing data protection policies.
With this language, a company's data usage policy can be checked for compliance with data subjects' consent.
We exploit IBQ reasoning to integrate specialized reasoners for the policy language and the vocabulary's language.
arXiv Detail & Related papers (2020-01-16T09:08:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.