A Multi-solution Study on GDPR AI-enabled Completeness Checking of DPAs
- URL: http://arxiv.org/abs/2311.13881v1
- Date: Thu, 23 Nov 2023 10:05:52 GMT
- Title: A Multi-solution Study on GDPR AI-enabled Completeness Checking of DPAs
- Authors: Muhammad Ilyas Azeem and Sallam Abualhaija
- Abstract summary: The General Data Protection Regulation (GDPR) requires a data processing agreement (DPA), which regulates the processing and ensures that personal data remains protected.
Checking the completeness of a DPA according to the GDPR provisions is therefore an essential prerequisite to ensure that the elicited requirements are complete.
We propose an automation strategy to address the completeness checking of DPAs against GDPR.
- Score: 3.1002416427168304
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Specifying legal requirements for software systems to ensure their compliance
with the applicable regulations is a major concern to requirements engineering
(RE). Personal data which is collected by an organization is often shared with
other organizations to perform certain processing activities. In such cases,
the General Data Protection Regulation (GDPR) requires issuing a data
processing agreement (DPA) which regulates the processing and further ensures
that personal data remains protected. Violating GDPR can lead to huge fines
reaching billions of euros. Software systems involving personal data
processing must adhere to the legal obligations stipulated in GDPR and outlined
in DPAs. Requirements engineers can elicit from DPAs legal requirements for
regulating the data processing activities in software systems. Checking the
completeness of a DPA according to the GDPR provisions is therefore an
essential prerequisite to ensure that the elicited requirements are complete.
Analyzing DPAs entirely manually is time-consuming and requires adequate legal
expertise. In this paper, we propose an automation strategy to address the
completeness checking of DPAs against GDPR. Specifically, we pursue ten
alternative solutions which are enabled by different technologies, namely
traditional machine learning, deep learning, language modeling, and few-shot
learning. The goal of our work is to empirically examine how these different
technologies fare in the legal domain. We computed the F2 score on a set of 30 real
DPAs. Our evaluation shows that the best-performing solutions, which yield F2 scores of
86.7% and 89.7%, are based on pre-trained BERT and RoBERTa language models. Our
analysis further shows that other alternative solutions based on deep learning
(e.g., BiLSTM) and few-shot learning (e.g., SetFit) can achieve comparable
accuracy, yet are more efficient to develop.
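For readers unfamiliar with the metric, the F2 score mentioned in the abstract is the F-beta score with beta = 2, which weights recall more heavily than precision. The following is a minimal sketch of the standard definition only. The function name `f_beta` and the confusion counts are hypothetical, and the abstract does not say how the authors aggregate scores across provisions or DPAs, so this should not be read as their evaluation code.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """Standard F-beta score from confusion counts.

    beta > 1 weights recall above precision; beta = 2 gives the F2 score.
    """
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts, chosen only for illustration (not taken from the paper).
tp, fp, fn = 80, 30, 10

f1 = f_beta(tp, fp, fn, beta=1.0)
f2 = f_beta(tp, fp, fn, beta=2.0)
print(f"F1 = {f1:.3f}, F2 = {f2:.3f}")
```

With these counts recall (80/90) is higher than precision (80/110), so F2 comes out above F1. A recall-weighted metric suits completeness checking when a missed provision is assumed to be costlier than a false alarm.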
Related papers
- AutoPT: How Far Are We from the End2End Automated Web Penetration Testing? [54.65079443902714]
We introduce AutoPT, an automated penetration testing agent based on the principle of PSM driven by LLMs.
Our results show that AutoPT outperforms the baseline framework ReAct on the GPT-4o mini model.
arXiv Detail & Related papers (2024-11-02T13:24:30Z) - Rethinking Legal Compliance Automation: Opportunities with Large Language Models [2.9088208525097365]
We argue that the examination of (textual) legal artifacts should first employ a broader context than sentences.
We present a compliance analysis approach designed to address these limitations.
arXiv Detail & Related papers (2024-04-22T17:10:27Z) - Towards an Enforceable GDPR Specification [49.1574468325115]
Privacy by Design (PbD) is prescribed by modern privacy regulations such as the EU's GDPR.
One emerging technique to realize PbD is runtime enforcement (RE).
We present a set of requirements and an iterative methodology for creating formal specifications of legal provisions.
arXiv Detail & Related papers (2024-02-27T09:38:51Z) - Legal Requirements Analysis [2.3349787245442966]
We explore a variety of methods for analyzing legal requirements and exemplify them on representations.
We describe possible alternatives for creating machine-analyzable representations from regulations.
arXiv Detail & Related papers (2023-11-23T09:31:57Z) - Better Practices for Domain Adaptation [62.70267990659201]
Domain adaptation (DA) aims to provide frameworks for adapting models to deployment data without using labels.
Unclear validation protocol for DA has led to bad practices in the literature.
We show challenges across all three branches of domain adaptation methodology.
arXiv Detail & Related papers (2023-09-07T17:44:18Z) - Privacy Adhering Machine Un-learning in NLP [66.17039929803933]
Real-world industry uses Machine Learning to build models on user data.
Such mandates require effort both in terms of data as well as model retraining.
Continuous removal of data and model retraining steps do not scale.
We propose Machine Unlearning to tackle this challenge.
arXiv Detail & Related papers (2022-12-19T16:06:45Z) - NLP-based Automated Compliance Checking of Data Processing Agreements against GDPR [9.022562906627991]
We propose an automated solution to check compliance of a given DPA against the "shall" requirements.
Our approach correctly finds 618 out of 750 genuine violations while raising 76 false violations, and further correctly identifies 524 satisfied requirements.
arXiv Detail & Related papers (2022-09-20T13:50:58Z) - How Much More Data Do I Need? Estimating Requirements for Downstream Tasks [99.44608160188905]
Given a small training data set and a learning algorithm, how much more data is necessary to reach a target validation or test performance?
Overestimating or underestimating data requirements incurs substantial costs that could be avoided with an adequate budget.
Using our guidelines, practitioners can accurately estimate data requirements of machine learning systems to gain savings in both development time and data acquisition costs.
arXiv Detail & Related papers (2022-07-04T21:16:05Z) - GDPR: When the Right to Access Personal Data Becomes a Threat [63.732639864601914]
We examine more than 300 data controllers, performing for each of them a request to access personal data.
We find that 50.4% of the data controllers that handled the request have flaws in the procedure of identifying the users.
The undesired and surprising result is that GDPR, in its present deployment, has actually decreased the privacy of the users of web services.
arXiv Detail & Related papers (2020-05-04T22:01:46Z) - Machine Understandable Policies and GDPR Compliance Checking [9.032680855473986]
The SPECIAL H2020 project aims to provide a set of tools that can be used by data controllers to automatically check if personal data sharing complies with the obligations set forth in the GDPR.
arXiv Detail & Related papers (2020-01-24T09:41:47Z)