Related papers: A Multi-solution Study on GDPR AI-enabled Completeness Checking of DPAs

URL: http://arxiv.org/abs/2311.13881v1
Date: Thu, 23 Nov 2023 10:05:52 GMT
Title: A Multi-solution Study on GDPR AI-enabled Completeness Checking of DPAs
Authors: Muhammad Ilyas Azeem and Sallam Abualhaija
Abstract summary: The General Data Protection Regulation (GDPR) requires a data processing agreement (DPA), which regulates processing and ensures that personal data remains protected. Checking the completeness of a DPA against the GDPR provisions is therefore an essential prerequisite to ensure that the elicited requirements are complete. We propose an automation strategy to address the completeness checking of DPAs against GDPR.
Score: 3.1002416427168304
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Specifying legal requirements for software systems to ensure their compliance with the applicable regulations is a major concern in requirements engineering (RE). Personal data collected by an organization is often shared with other organizations to perform certain processing activities. In such cases, the General Data Protection Regulation (GDPR) requires issuing a data processing agreement (DPA), which regulates the processing and further ensures that personal data remains protected. Violating GDPR can lead to huge fines reaching billions of euros. Software systems involving personal data processing must adhere to the legal obligations stipulated in GDPR and outlined in DPAs. Requirements engineers can elicit from DPAs legal requirements for regulating the data processing activities in software systems. Checking the completeness of a DPA according to the GDPR provisions is therefore an essential prerequisite to ensure that the elicited requirements are complete. Analyzing DPAs entirely manually is time-consuming and requires adequate legal expertise. In this paper, we propose an automation strategy to address the completeness checking of DPAs against GDPR. Specifically, we pursue ten alternative solutions which are enabled by different technologies, namely traditional machine learning, deep learning, language modeling, and few-shot learning. The goal of our work is to empirically examine how these different technologies fare in the legal domain. We computed the F2 score on a set of 30 real DPAs. Our evaluation shows that the best-performing solutions, which yield F2 scores of 86.7% and 89.7%, are based on pre-trained BERT and RoBERTa language models. Our analysis further shows that other alternative solutions based on deep learning (e.g., BiLSTM) and few-shot learning (e.g., SetFit) can achieve comparable accuracy, yet are more efficient to develop.
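Note on the metric: the F2 score mentioned in the abstract is the F-beta measure with beta = 2, which weights recall more heavily than precision. The following is a minimal sketch of that formula computed from raw counts. The function name `f_beta` and the example counts are illustrative assumptions and are not taken from the paper.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """Return the F-beta score from true positives, false positives,
    and false negatives. With beta > 1, recall counts more than precision."""
    if tp == 0:
        # No correct predictions: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts, for illustration only (not results from the paper).
example_f2 = f_beta(tp=80, fp=20, fn=10)
print(f"F2 = {example_f2:.3f}")
```

Because beta = 2 multiplies the cost of false negatives by four relative to false positives, a checker that misses a required provision is penalized more than one that raises a spurious warning.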

Related papers

Does Machine Unlearning Truly Remove Model Knowledge? A Framework for Auditing Unlearning in LLMs [58.24692529185971]
We introduce a comprehensive auditing framework for unlearning evaluation comprising three benchmark datasets, six unlearning algorithms, and five prompt-based auditing methods. We evaluate the effectiveness and robustness of different unlearning strategies.
arXiv Detail & Related papers (2025-05-29T09:19:07Z)
PIPA: Preference Alignment as Prior-Informed Statistical Estimation [57.24096291517857]
We introduce Prior-Informed Preference Alignment (PIPA), a unified, RL-free probabilistic framework. PIPA accommodates both paired and unpaired data, as well as answer and step-level annotations. By integrating different types of prior information, we developed two variations of PIPA: PIPA-M and PIPA-N.
arXiv Detail & Related papers (2025-02-09T04:31:30Z)
AutoPT: How Far Are We from the End2End Automated Web Penetration Testing? [54.65079443902714]
We introduce AutoPT, an automated penetration testing agent based on the principle of PSM driven by LLMs. Our results show that AutoPT outperforms the baseline framework ReAct on the GPT-4o mini model.
arXiv Detail & Related papers (2024-11-02T13:24:30Z)
Rethinking Legal Compliance Automation: Opportunities with Large Language Models [2.9088208525097365]
We argue that the examination of (textual) legal artifacts should first employ broader context than sentences. We present a compliance analysis approach designed to address these limitations.
arXiv Detail & Related papers (2024-04-22T17:10:27Z)
Towards an Enforceable GDPR Specification [49.1574468325115]
Privacy by Design (PbD) is prescribed by modern privacy regulations such as the EU's GDPR. One emerging technique to realize PbD is runtime enforcement (RE). We present a set of requirements and an iterative methodology for creating formal specifications of legal provisions.
arXiv Detail & Related papers (2024-02-27T09:38:51Z)
Legal Requirements Analysis [2.3349787245442966]
We explore a variety of methods for analyzing legal requirements and exemplify them on representations. We describe possible alternatives for creating machine-analyzable representations from regulations.
arXiv Detail & Related papers (2023-11-23T09:31:57Z)
Better Practices for Domain Adaptation [62.70267990659201]
Domain adaptation (DA) aims to provide frameworks for adapting models to deployment data without using labels. Unclear validation protocol for DA has led to bad practices in the literature. We show challenges across all three branches of domain adaptation methodology.
arXiv Detail & Related papers (2023-09-07T17:44:18Z)
Privacy Adhering Machine Un-learning in NLP [66.17039929803933]
In the real world, industry uses Machine Learning to build models on user data. Such mandates require effort both in terms of data and model retraining. Continuous removal of data and model retraining steps do not scale. We propose Machine Unlearning to tackle this challenge.
arXiv Detail & Related papers (2022-12-19T16:06:45Z)
NLP-based Automated Compliance Checking of Data Processing Agreements against GDPR [9.022562906627991]
We propose an automated solution to check compliance of a given DPA against the "shall" requirements. Our approach correctly finds 618 out of 750 genuine violations while raising 76 false violations, and further correctly identifies 524 satisfied requirements.
arXiv Detail & Related papers (2022-09-20T13:50:58Z)
How Much More Data Do I Need? Estimating Requirements for Downstream Tasks [99.44608160188905]
Given a small training data set and a learning algorithm, how much more data is necessary to reach a target validation or test performance? Overestimating or underestimating data requirements incurs substantial costs that could be avoided with an adequate budget. Using our guidelines, practitioners can accurately estimate data requirements of machine learning systems to gain savings in both development time and data acquisition costs.
arXiv Detail & Related papers (2022-07-04T21:16:05Z)
GDPR: When the Right to Access Personal Data Becomes a Threat [63.732639864601914]
We examine more than 300 data controllers, submitting to each of them a request to access personal data. We find that 50.4% of the data controllers that handled the request have flaws in their procedure for identifying users. The undesired and surprising result is that the right to access, in its present deployment, has actually decreased the privacy of the users of web services.
arXiv Detail & Related papers (2020-05-04T22:01:46Z)
Machine Understandable Policies and GDPR Compliance Checking [9.032680855473986]
The SPECIAL H2020 project aims to provide a set of tools that data controllers can use to automatically check whether personal data sharing complies with regulatory obligations.
arXiv Detail & Related papers (2020-01-24T09:41:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.