Compliance Management for Federated Data Processing
- URL: http://arxiv.org/abs/2602.19360v1
- Date: Sun, 22 Feb 2026 22:10:25 GMT
- Title: Compliance Management for Federated Data Processing
- Authors: Natallia Kokash, Adam Belloum, Paola Grosso
- Abstract summary: Federated data processing (FDP) offers a promising approach for enabling collaborative analysis of sensitive data without centralizing raw datasets. We present a framework for compliance-aware FDP that integrates policy-as-code, workflow orchestration, and large language model (LLM)-assisted compliance management.
- Score: 1.3836910960262496
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated data processing (FDP) offers a promising approach for enabling collaborative analysis of sensitive data without centralizing raw datasets. However, real-world adoption remains limited due to the complexity of managing heterogeneous access policies, regulatory requirements, and long-running workflows across organizational boundaries. In this paper, we present a framework for compliance-aware FDP that integrates policy-as-code, workflow orchestration, and large language model (LLM)-assisted compliance management. Through the implemented prototype, we show how legal and organizational requirements can be collected and translated into machine-actionable policies in FDP networks.
Related papers
- DataOps-driven CI/CD for analytics repositories [0.0]
This perspective proposes a qualitative design for a DataOps-aligned validation framework. The framework consists of five stages: Lint, Optimize, Parse, and Observe. A Requirements Traceability Matrix (RTM) demonstrates how each high-level control is enforced by concrete pipeline checks.
arXiv Detail & Related papers (2025-11-15T16:09:47Z) - Analyzing and Internalizing Complex Policy Documents for LLM Agents [53.14898416858099]
Large Language Model (LLM)-based agentic systems rely on in-context policy documents encoding diverse business rules. This motivates developing internalization methods that embed policy documents into model priors while preserving performance. We introduce CC-Gen, an agentic benchmark generator with Controllable Complexity across four levels.
arXiv Detail & Related papers (2025-10-13T16:30:07Z) - PETLP: A Privacy-by-Design Pipeline for Social Media Data in AI Research [2.185322080975722]
PETLP (Privacy-by-design Extract, Transform, Load, and Present) is a compliance framework that embeds legal safeguards directly into extended pipelines. We demonstrate how extraction rights fundamentally differ between qualifying research organisations. We show why true anonymisation remains unachievable for social media data.
arXiv Detail & Related papers (2025-08-12T08:33:40Z) - Lawful and Accountable Personal Data Processing with GDPR-based Access and Usage Control in Distributed Systems [0.0]
This paper proposes a case-generic method for automated normative reasoning that establishes legal arguments for the lawfulness of data processing activities. The arguments are established on the basis of case-specific legal qualifications made by privacy experts, bringing the human in the loop. The resulting system is designed and critically assessed in reference to requirements extracted from the GDPR.
arXiv Detail & Related papers (2025-03-10T10:49:34Z) - CBCMS: A Compliance Management System for Cross-Border Data Transfer [0.41942958779358674]
We propose the Cross-Border Compliance Management System (CBCMS) for cross-border data transfer. PDL supports the unified management of data processing policies, bridging the gap between natural language policies and machine-processable expressions. CPGM generates compliant data processing policies with high accuracy, achieving up to 25.16% improvement in F1 score.
arXiv Detail & Related papers (2024-12-12T06:48:00Z) - RIRAG: Regulatory Information Retrieval and Answer Generation [51.998738311700095]
We introduce a task of generating question-passage pairs, where questions are automatically created and paired with relevant regulatory passages. We create the ObliQA dataset, containing 27,869 questions derived from the collection of Abu Dhabi Global Markets (ADGM) financial regulation documents. We design a baseline Regulatory Information Retrieval and Answer Generation (RIRAG) system and evaluate it with RePASs, a novel evaluation metric.
arXiv Detail & Related papers (2024-09-09T14:44:19Z) - Sparsity-Aware Intelligent Massive Random Access Control in Open RAN: A Reinforcement Learning Based Approach [61.74489383629319]
Massive random access of devices in the emerging Open Radio Access Network (O-RAN) poses a great challenge to access control and management.
A reinforcement-learning (RL)-assisted scheme of closed-loop access control is proposed to preserve the sparsity of access requests.
Deep-RL-assisted SAUD is proposed to resolve highly complex environments with continuous and high-dimensional state and action spaces.
arXiv Detail & Related papers (2023-03-05T12:25:49Z) - Distributed-Training-and-Execution Multi-Agent Reinforcement Learning for Power Control in HetNet [48.96004919910818]
We propose a multi-agent deep reinforcement learning (MADRL) based power control scheme for the HetNet.
To promote cooperation among agents, we develop a penalty-based Q learning (PQL) algorithm for MADRL systems.
In this way, an agent's policy can be learned by other agents more easily, resulting in a more efficient collaboration process.
arXiv Detail & Related papers (2022-12-15T17:01:56Z) - Relational Action Bases: Formalization, Effective Safety Verification, and Invariants (Extended Version) [67.99023219822564]
We introduce the general framework of relational action bases (RABs).
RABs generalize existing models by lifting both restrictions.
We demonstrate the effectiveness of this approach on a benchmark of data-aware business processes.
arXiv Detail & Related papers (2022-08-12T17:03:50Z) - Learning to Limit Data Collection via Scaling Laws: Data Minimization Compliance in Practice [62.44110411199835]
We build on literature in machine learning and law to propose a framework for limiting collection based on data interpretation that ties data to system performance.
We formalize a data minimization criterion based on performance curve derivatives and provide an effective and interpretable piecewise power law technique.
arXiv Detail & Related papers (2021-07-16T19:59:01Z) - Modular Deep Reinforcement Learning for Continuous Motion Planning with Temporal Logic [59.94347858883343]
This paper investigates the motion planning of autonomous dynamical systems modeled by Markov decision processes (MDPs).
The novelty is to design an embedded product MDP (EP-MDP) between the LDGBA and the MDP.
The proposed LDGBA-based reward shaping and discounting schemes for the model-free reinforcement learning (RL) only depend on the EP-MDP states.
arXiv Detail & Related papers (2021-02-24T01:11:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.