Mapping and Comparing Data Governance Frameworks: A benchmarking
exercise to inform global data governance deliberations
- URL: http://arxiv.org/abs/2302.13731v1
- Date: Mon, 27 Feb 2023 12:56:25 GMT
- Title: Mapping and Comparing Data Governance Frameworks: A benchmarking
exercise to inform global data governance deliberations
- Authors: Sara Marcucci, Natalia Gonzalez Alarcon, Stefaan G. Verhulst, and
Elena Wullhorst
- Abstract summary: The article explores the increasing importance of global data governance, driven by the rapid growth of data and the need for responsible data use and protection.
The report highlights the need for a more holistic, coordinated approach to data governance to manage the global flow of data responsibly and for the public interest.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Data has become a critical resource for organizations and society. Yet, it is
not always as valuable as it could be since there is no well-defined approach
to managing and using it. This article explores the increasing importance of
global data governance due to the rapid growth of data and the need for
responsible data use and protection. While historically associated with private
organizational governance, data governance has evolved to include governmental
and institutional bodies. However, the lack of a global consensus and
fragmentation in policies and practices pose challenges to the development of a
common framework. The purpose of this report is to compare approaches and
identify patterns in the emergent and fragmented data governance ecosystem
within sectors close to the international development field, ultimately
presenting key takeaways and reflections on when and why a global data
governance framework may be needed. Overall, the report highlights the need for
a more holistic, coordinated transnational approach to data governance to
manage the global flow of data responsibly and for the public interest. The
article begins with an overview of the current fragmented data governance
ecology and then describes the methodology used. Subsequently, the
paper presents the most relevant findings stemming from the research. These
are organized according to six key elements: (a) purpose, (b) principles, (c)
anchoring documents, (d) data description and lifecycle, (e) processes, and (f)
practices. Finally, the article closes with a series of key takeaways and final
reflections.
Related papers
- A Systematic Review of NeurIPS Dataset Management Practices [7.974245534539289]
We present a systematic review of datasets published at the NeurIPS track, focusing on four key aspects: provenance, distribution, ethical disclosure, and licensing.
Our findings reveal that dataset provenance is often unclear due to ambiguous filtering and curation processes.
These inconsistencies underscore the urgent need for standardized data infrastructures for the publication and management of datasets.
arXiv Detail & Related papers (2024-10-31T23:55:41Z)
- Human-Data Interaction Framework: A Comprehensive Model for a Future Driven by Data and Humans [0.0]
The Human-Data Interaction (HDI) framework has become an essential approach to tackling the challenges and ethical issues associated with data governance and utilization in the modern digital world.
This paper outlines the fundamental steps required for organizations to seamlessly integrate HDI principles.
arXiv Detail & Related papers (2024-07-30T17:57:09Z)
- Data-Centric AI in the Age of Large Language Models [51.20451986068925]
This position paper proposes a data-centric viewpoint of AI research, focusing on large language models (LLMs).
We make the key observation that data is instrumental in the developmental (e.g., pretraining and fine-tuning) and inferential stages (e.g., in-context learning) of LLMs.
We identify four specific scenarios centered around data, covering data-centric benchmarks and data curation, data attribution, knowledge transfer, and inference contextualization.
arXiv Detail & Related papers (2024-06-20T16:34:07Z)
- On Responsible Machine Learning Datasets with Fairness, Privacy, and Regulatory Norms [56.119374302685934]
There have been severe concerns over the trustworthiness of AI technologies.
Machine and deep learning algorithms depend heavily on the data used during their development.
We propose a framework to evaluate the datasets through a responsible rubric.
arXiv Detail & Related papers (2023-10-24T14:01:53Z)
- Semantic Modelling of Organizational Knowledge as a Basis for Enterprise Data Governance 4.0 -- Application to a Unified Clinical Data Model [6.302916372143144]
We establish a simple, cost-efficient framework that enables metadata-driven, agile and (semi-automated) data governance.
We explain how we implement and use this framework to integrate 25 years of clinical study data at an enterprise scale in a fully productive environment.
arXiv Detail & Related papers (2023-10-20T19:36:03Z)
- Auditing and Generating Synthetic Data with Controllable Trust Trade-offs [54.262044436203965]
We introduce a holistic auditing framework that comprehensively evaluates synthetic datasets and AI models.
It focuses on preventing bias and discrimination, ensures fidelity to the source data, and assesses utility, robustness, and privacy preservation.
We demonstrate the framework's effectiveness by auditing various generative models across diverse use cases.
arXiv Detail & Related papers (2023-04-21T09:03:18Z)
- Data Governance in the Age of Large-Scale Data-Driven Language Technology [79.92626780294258]
This work proposes an approach to global language data governance that attempts to organize data management amongst stakeholders, values, and rights.
The framework we present is a multi-party international governance structure focused on language data, and incorporating technical and organizational tools needed to support its work.
arXiv Detail & Related papers (2022-05-04T00:44:35Z)
- Explainable Patterns: Going from Findings to Insights to Support Data Analytics Democratization [60.18814584837969]
We present Explainable Patterns (ExPatt), a new framework to support lay users in exploring and creating data stories.
ExPatt automatically generates plausible explanations for observed or selected findings using an external (textual) source of information.
arXiv Detail & Related papers (2021-01-19T16:13:44Z)
- Second layer data governance for permissioned blockchains: the privacy management challenge [58.720142291102135]
In pandemic situations, such as the COVID-19 and Ebola outbreaks, sharing health data is crucial to avoid mass infection and reduce the number of deaths.
In this context, permissioned blockchain technology emerges to empower users to exercise their rights by providing data ownership, transparency, and security through an immutable, unified, and distributed database governed by smart contracts.
arXiv Detail & Related papers (2020-10-22T13:19:38Z)
- Knowledge Scientists: Unlocking the data-driven organization [5.05432938384774]
We argue that the technologies for reliable data are driven by distinct concerns and expertise.
Those organizations which identify the central importance of meaningful, explainable, reproducible, and maintainable data will be at the forefront of the democratization of reliable data.
arXiv Detail & Related papers (2020-04-16T20:14:20Z)