Dr.Aid: Supporting Data-governance Rule Compliance for Decentralized
Collaboration in an Automated Way
- URL: http://arxiv.org/abs/2110.01056v1
- Date: Sun, 3 Oct 2021 17:59:28 GMT
- Authors: Rui Zhao, Malcolm Atkinson, Petros Papapanagiotou, Federica Magnoni,
Jacques Fleuriot
- Abstract summary: Dr.Aid is a framework that helps individuals, organisations and federations comply with data rules.
It encodes data-governance rules using a formal language and performs reasoning on data-flow graphs.
We evaluate the model in three aspects by encoding real-life data-use policies from diverse fields.
- Score: 7.744664716152106
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Collaboration across institutional boundaries is widespread and increasing
today. It depends on federations sharing data that often have governance rules
or external regulations restricting their use. However, the handling of data
governance rules (a.k.a. data-use policies) remains manual, time-consuming and
error-prone, limiting the rate at which collaborations can form and respond to
challenges and opportunities, inhibiting citizen science and reducing data
providers' trust in compliance. Using an automated system to facilitate
compliance handling substantially reduces the time needed for such non-mission
work, thereby accelerating collaboration and improving productivity. We present
a framework, Dr.Aid, that helps individuals, organisations and federations
comply with data rules, using automation to track which rules are applicable as
data is passed between processes and as derived data is generated. It encodes
data-governance rules using a formal language and performs reasoning on
multi-input-multi-output data-flow graphs in decentralised contexts. We test
its power and utility by working with users performing cyclone tracking and
earthquake modelling to support mitigation and emergency response. We query
standard provenance traces to detach Dr.Aid from details of the tools and
systems they are using, as these inevitably vary across members of a federation
and through time. We evaluate the model in three aspects by encoding real-life
data-use policies from diverse fields, showing its capability for real-world
usage and its advantages compared with traditional frameworks. We argue that
this approach will lead to more agile, more productive and more trustworthy
collaborations and show that the approach can be adopted incrementally. This,
in turn, will allow more appropriate data policies to emerge, opening up new
forms of collaboration.
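The rule tracking described in the abstract, where rules follow data through multi-input-multi-output processes, can be illustrated with a minimal sketch. This is not Dr.Aid's formal language or its reasoner: the function name `propagate_rules`, the `dropped_by` mapping (a hypothetical stand-in for a flow rule that discharges an obligation), and all node and rule labels are assumptions made up for illustration. The sketch simply takes the union of the rules on each process's inputs and walks the data-flow graph in topological order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def propagate_rules(edges, source_rules, dropped_by=None):
    """Return the set of rule labels attached to each node's output.

    edges: list of (upstream, downstream) pairs forming a DAG.
    source_rules: dict mapping source data nodes to their rule labels.
    dropped_by: dict mapping a process to the rule labels it discharges
                (hypothetical simplification of a flow rule).
    """
    dropped_by = dropped_by or {}

    # Build a node -> predecessors mapping, as TopologicalSorter expects.
    predecessors = {}
    for upstream, downstream in edges:
        predecessors.setdefault(downstream, set()).add(upstream)
        predecessors.setdefault(upstream, set())
    for node in source_rules:
        predecessors.setdefault(node, set())

    rules = {}
    # static_order() yields every node after all of its predecessors.
    for node in TopologicalSorter(predecessors).static_order():
        inherited = set(source_rules.get(node, ()))
        for parent in predecessors[node]:
            inherited |= rules[parent]  # combine rules from every input
        rules[node] = inherited - set(dropped_by.get(node, ()))
    return rules


# Illustrative graph: two sources feed one process, whose output is
# aggregated and then reported. All labels are invented.
edges = [
    ("cyclone_obs", "tracker"),
    ("model_output", "tracker"),
    ("tracker", "aggregate"),
    ("aggregate", "report"),
]
source_rules = {
    "cyclone_obs": {"acknowledge-provider"},
    "model_output": {"non-commercial", "no-redistribution"},
}
dropped_by = {"aggregate": {"no-redistribution"}}

result = propagate_rules(edges, source_rules, dropped_by)
print(sorted(result["report"]))  # ['acknowledge-provider', 'non-commercial']
```

A real system would need richer rule semantics than set union and set difference (for example obligations with conditions, or rules attached to individual output ports), so this should be read only as a picture of how rules can accumulate on derived data.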
Related papers
- Federated Causal Discovery from Heterogeneous Data [70.31070224690399]
We propose a novel FCD method attempting to accommodate arbitrary causal models and heterogeneous data.
These approaches involve constructing summary statistics as a proxy of the raw data to protect data privacy.
We conduct extensive experiments on synthetic and real datasets to show the efficacy of our method.
arXiv Detail & Related papers (2024-02-20T18:53:53Z)
- Robot Fleet Learning via Policy Merging [58.5086287737653]
We propose FLEET-MERGE to efficiently merge policies in the fleet setting.
We show that FLEET-MERGE consolidates the behavior of policies trained on 50 tasks in the Meta-World environment.
We introduce a novel robotic tool-use benchmark, FLEET-TOOLS, for fleet policy learning in compositional and contact-rich robot manipulation tasks.
arXiv Detail & Related papers (2023-10-02T17:23:51Z)
- DBFed: Debiasing Federated Learning Framework based on Domain-Independent [15.639705798326213]
We propose a debiasing federated learning framework based on domain-independent, which mitigates model bias by explicitly encoding sensitive attributes during client-side training.
This paper conducts experiments on three real datasets and uses five evaluation metrics of accuracy and fairness to quantify the effect of the model.
arXiv Detail & Related papers (2023-07-10T14:39:57Z)
- Knowledge Transfer from Teachers to Learners in Growing-Batch Reinforcement Learning [8.665235113831685]
Control policies in real-world domains are typically trained offline from previously logged data or in a growing-batch manner.
In this setting a fixed policy is deployed to the environment and used to gather an entire batch of new data before being aggregated with past batches and used to update the policy.
While a limited number of such cycles is feasible in real-world domains, the quality and diversity of the resulting data are much lower than in the standard continually-interacting approach.
arXiv Detail & Related papers (2023-05-05T22:55:34Z)
- Benchmarking FedAvg and FedCurv for Image Classification Tasks [1.376408511310322]
This paper focuses on the problem of statistical heterogeneity of the data in the same federated network.
Several Federated Learning algorithms, such as FedAvg, FedProx and Federated Curvature (FedCurv) have already been proposed.
As a side product of this work, we release the non-IID version of the datasets we used so as to facilitate further comparisons from the FL community.
arXiv Detail & Related papers (2023-03-31T10:13:01Z)
- Combating Exacerbated Heterogeneity for Robust Models in Federated Learning [91.88122934924435]
Combination of adversarial training and federated learning can lead to the undesired robustness deterioration.
We propose a novel framework called Slack Federated Adversarial Training (SFAT).
We verify the rationality and effectiveness of SFAT on various benchmarked and real-world datasets.
arXiv Detail & Related papers (2023-03-01T06:16:15Z)
- Federated Anomaly Detection over Distributed Data Streams [0.0]
We propose an approach to building the bridge among anomaly detection, federated learning, and data streams.
The overarching goal of the work is to detect anomalies in a federated environment over distributed data streams.
arXiv Detail & Related papers (2022-05-16T17:38:58Z)
- Distributed Machine Learning and the Semblance of Trust [66.1227776348216]
Federated Learning (FL) allows the data owner to maintain data governance and perform model training locally without having to share their data.
FL and related techniques are often described as privacy-preserving.
We explain why this term is not appropriate and outline the risks associated with over-reliance on protocols that were not designed with formal definitions of privacy in mind.
arXiv Detail & Related papers (2021-12-21T08:44:05Z)
- An Automated Framework for Supporting Data-Governance Rule Compliance in Decentralized MIMO Contexts [10.62414957574478]
Dr.Aid is a logic-based AI framework for automated compliance checking of data governance rules over data-flow graphs.
Dr.Aid models data rules and flow rules and checks compliance by reasoning about the propagation, combination, modification and application of data rules over the data flow graphs.
arXiv Detail & Related papers (2021-09-02T10:53:03Z)
- Representative & Fair Synthetic Data [68.8204255655161]
We present a framework to incorporate fairness constraints into the self-supervised learning process.
We generate a representative as well as fair version of the UCI Adult census data set.
We consider representative & fair synthetic data a promising future building block to teach algorithms not on historic worlds, but rather on the worlds that we strive to live in.
arXiv Detail & Related papers (2021-04-07T09:19:46Z)
- Learning Connectivity for Data Distribution in Robot Teams [96.39864514115136]
We propose a task-agnostic, decentralized, low-latency method for data distribution in ad-hoc networks using Graph Neural Networks (GNN).
Our approach enables multi-agent algorithms based on global state information to function by ensuring it is available at each robot.
We train the distributed GNN communication policies via reinforcement learning using the average Age of Information as the reward function and show that it improves training stability compared to task-specific reward functions.
arXiv Detail & Related papers (2021-03-08T21:48:55Z)