NLP-based Automated Compliance Checking of Data Processing Agreements
against GDPR
- URL: http://arxiv.org/abs/2209.09722v2
- Date: Sun, 18 Jun 2023 12:59:12 GMT
- Title: NLP-based Automated Compliance Checking of Data Processing Agreements
against GDPR
- Authors: Orlando Amaral, Muhammad Ilyas Azeem, Sallam Abualhaija and Lionel C
Briand
- Abstract summary: We propose an automated solution to check compliance of a given DPA against the "shall" requirements.
Our approach correctly finds 618 out of 750 genuine violations while raising 76 false violations, and further correctly identifies 524 satisfied requirements.
- Score: 9.022562906627991
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Processing personal data is regulated in Europe by the General Data
Protection Regulation (GDPR) through data processing agreements (DPAs).
Checking the compliance of DPAs contributes to the compliance verification of
software systems as DPAs are an important source of requirements for software
development involving the processing of personal data. However, manually
checking whether a given DPA complies with GDPR is challenging as it requires
significant time and effort for understanding and identifying DPA-relevant
compliance requirements in GDPR and then verifying these requirements in the
DPA. In this paper, we propose an automated solution to check the compliance of
a given DPA against GDPR. In close interaction with legal experts, we first
built two artifacts: (i) the "shall" requirements extracted from the GDPR
provisions relevant to DPA compliance and (ii) a glossary table defining the
legal concepts in the requirements. Then, we developed an automated solution
that leverages natural language processing (NLP) technologies to check the
compliance of a given DPA against these "shall" requirements. Specifically, our
approach automatically generates phrasal-level representations for the textual
content of the DPA and compares it against predefined representations of the
"shall" requirements. Over a dataset of 30 actual DPAs, the approach correctly
finds 618 out of 750 genuine violations while raising 76 false violations, and
further correctly identifies 524 satisfied requirements. The approach thus has
an average precision of 89.1%, a recall of 82.4%, and an accuracy of 84.6%.
Compared to a baseline that relies on off-the-shelf NLP tools, our approach
provides an average accuracy gain of ~20 percentage points. The accuracy of our
approach can be improved to ~94% with limited manual verification effort.
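The reported metrics can be roughly cross-checked from the counts in the abstract. The sketch below assumes that a violation is the positive class, that the 750 genuine violations split into 618 found and 132 missed, and that the 76 false violations and 524 correctly identified satisfied requirements are the false positives and true negatives; the function and variable names are illustrative and not from the paper. Under these assumptions the pooled recall and accuracy round to the reported 82.4% and 84.6%, while the pooled precision comes out near 89.0%, slightly below the reported 89.1%, which the abstract describes as an average and which may therefore have been aggregated differently.

```python
# Counts taken from the abstract; a violation is treated as the positive class (assumption).
GENUINE_VIOLATIONS = 750   # true positives + false negatives
VIOLATIONS_FOUND = 618     # true positives
FALSE_VIOLATIONS = 76      # false positives
SATISFIED_FOUND = 524      # true negatives


def confusion_metrics(tp: int, fn: int, fp: int, tn: int) -> tuple[float, float, float]:
    """Return (precision, recall, accuracy) from pooled confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return precision, recall, accuracy


# Violations the approach missed (false negatives).
missed_violations = GENUINE_VIOLATIONS - VIOLATIONS_FOUND

precision, recall, accuracy = confusion_metrics(
    tp=VIOLATIONS_FOUND,
    fn=missed_violations,
    fp=FALSE_VIOLATIONS,
    tn=SATISFIED_FOUND,
)

print(f"precision: {precision:.1%}")
print(f"recall:    {recall:.1%}")
print(f"accuracy:  {accuracy:.1%}")
```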
Related papers
- Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs [54.05511925104712]
We propose a simple, effective, and data-efficient method called Step-DPO.
Step-DPO treats individual reasoning steps as units for preference optimization rather than evaluating answers holistically.
Our findings demonstrate that as few as 10K preference data pairs and fewer than 500 Step-DPO training steps can yield a nearly 3% gain in accuracy on MATH for models with over 70B parameters.
arXiv Detail & Related papers (2024-06-26T17:43:06Z) - Demystifying Legalese: An Automated Approach for Summarizing and Analyzing Overlaps in Privacy Policies and Terms of Service [0.6240153531166704]
Our work seeks to alleviate this issue by developing language models that provide automated, accessible summaries and scores for such documents.
We compared transformer-based and conventional models during training on our dataset, and RoBERTa performed better overall with a remarkable 0.74 F1-score.
arXiv Detail & Related papers (2024-04-17T19:53:59Z) - Towards an Enforceable GDPR Specification [49.1574468325115]
Privacy by Design (PbD) is prescribed by modern privacy regulations such as the EU's GDPR.
One emerging technique to realize PbD is runtime enforcement (RE).
We present a set of requirements and an iterative methodology for creating formal specifications of legal provisions.
arXiv Detail & Related papers (2024-02-27T09:38:51Z) - A Multi-solution Study on GDPR AI-enabled Completeness Checking of DPAs [3.1002416427168304]
The General Data Protection Regulation (GDPR) requires a data processing agreement (DPA) which regulates processing and ensures personal data remains protected.
Checking the completeness of a DPA according to prerequisite provisions is therefore essential to ensure that requirements are complete.
We propose an automation strategy to address the completeness checking of DPAs against stipulated provisions.
arXiv Detail & Related papers (2023-11-23T10:05:52Z) - Better Practices for Domain Adaptation [62.70267990659201]
Domain adaptation (DA) aims to provide frameworks for adapting models to deployment data without using labels.
Unclear validation protocol for DA has led to bad practices in the literature.
We show challenges across all three branches of domain adaptation methodology.
arXiv Detail & Related papers (2023-09-07T17:44:18Z) - WiCE: Real-World Entailment for Claims in Wikipedia [63.234352061821625]
We propose WiCE, a new fine-grained textual entailment dataset built on natural claim and evidence pairs extracted from Wikipedia.
In addition to standard claim-level entailment, WiCE provides entailment judgments over sub-sentence units of the claim.
We show that real claims in our dataset involve challenging verification and retrieval problems that existing models fail to address.
arXiv Detail & Related papers (2023-03-02T17:45:32Z) - Delving into Probabilistic Uncertainty for Unsupervised Domain Adaptive
Person Re-Identification [54.174146346387204]
We propose an approach named probabilistic uncertainty guided progressive label refinery (P$^2$LR) for domain adaptive person re-identification.
A quantitative criterion is established to measure the uncertainty of pseudo labels and facilitate the network training.
Our method outperforms the baseline by 6.5% mAP on the Duke2Market task, while surpassing the state-of-the-art method by 2.5% mAP on the Market2MSMT task.
arXiv Detail & Related papers (2021-12-28T07:40:12Z) - AI-enabled Automation for Completeness Checking of Privacy Policies [7.707284039078785]
In Europe, privacy policies are subject to compliance with the General Data Protection Regulation.
In this paper, we propose AI-based automation for completeness checking of privacy policies.
arXiv Detail & Related papers (2021-06-10T12:10:51Z) - GDPR: When the Right to Access Personal Data Becomes a Threat [63.732639864601914]
We examine more than 300 data controllers, submitting to each of them a request to access personal data.
We find that 50.4% of the data controllers that handled the request have flaws in the procedure of identifying the users.
The undesired and surprising result is that GDPR, in its present deployment, has actually decreased the privacy of the users of web services.
arXiv Detail & Related papers (2020-05-04T22:01:46Z) - Machine Understandable Policies and GDPR Compliance Checking [9.032680855473986]
The SPECIAL H2020 project aims to provide a set of tools that data controllers can use to automatically check whether personal data sharing complies with the regulatory obligations set forth in the GDPR.
arXiv Detail & Related papers (2020-01-24T09:41:47Z)