Honest Computing: Achieving demonstrable data lineage and provenance for driving data and process-sensitive policies
- URL: http://arxiv.org/abs/2407.14390v1
- Date: Fri, 19 Jul 2024 15:13:42 GMT
- Title: Honest Computing: Achieving demonstrable data lineage and provenance for driving data and process-sensitive policies
- Authors: Florian Guitton, Axel Oehmichen, Étienne Bossé, Yike Guo
- Abstract summary: Data is susceptible to undue disclosures, leaks, losses, manipulation, or fabrication.
We introduce the concept of Honest Computing as the practice and approach that emphasizes transparency, integrity, and ethical behaviour.
This foundational layer approach can help define new standards for appropriate data custody and processing.
- Score: 8.67097489372345
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Data is the foundation of any scientific, industrial or commercial process. Its journey typically flows from collection to transport, storage, management and processing. While best practices and regulations guide data management and protection, recent events have underscored its vulnerability. Academic research and commercial data handling have been marred by scandals, revealing the brittleness of data management. Data, despite its importance, is susceptible to undue disclosures, leaks, losses, manipulation, or fabrication. These incidents often occur without visibility or accountability, necessitating a systematic structure for safe, honest, and auditable data management. In this paper, we introduce the concept of Honest Computing as the practice and approach that emphasizes transparency, integrity, and ethical behaviour within the realm of computing and technology. It ensures that computer systems and software operate honestly and reliably without hidden agendas, biases, or unethical practices. It enables privacy and confidentiality of data and code by design and by default. We also introduce a reference framework to achieve demonstrable data lineage and provenance, contrasting it with Secure Computing, a related but differently-orientated form of computing. At its core, Honest Computing leverages Trustless Computing, Confidential Computing, Distributed Computing, Cryptography and AAA security concepts. Honest Computing opens new ways of creating technology-based processes and workflows which permit the migration of regulatory frameworks for data protection from principle-based approaches to rule-based ones. Addressing use cases in many fields, from AI model protection and ethical layering to digital currency formation for finance and banking, trading, and healthcare, this foundational layer approach can help define new standards for appropriate data custody and processing.
Related papers
- Secure Computation and Trustless Data Intermediaries in Data Spaces [0.44998333629984877]
This paper explores the integration of advanced cryptographic techniques for secure computation in data spaces.
We exploit the introduced secure methods, i.e., Secure Multi-Party Computation (MPC) and Fully Homomorphic Encryption (FHE).
We present solutions through real-world use cases, including air traffic management, manufacturing, and secondary data use.
arXiv Detail & Related papers (2024-10-21T19:10:53Z)
- Human-Data Interaction Framework: A Comprehensive Model for a Future Driven by Data and Humans [0.0]
The Human-Data Interaction (HDI) framework has become an essential approach to tackling the challenges and ethical issues associated with data governance and utilization in the modern digital world.
This paper outlines the fundamental steps required for organizations to seamlessly integrate HDI principles.
arXiv Detail & Related papers (2024-07-30T17:57:09Z)
- Auditing and Generating Synthetic Data with Controllable Trust Trade-offs [54.262044436203965]
We introduce a holistic auditing framework that comprehensively evaluates synthetic datasets and AI models.
It focuses on preventing bias and discrimination, ensuring fidelity to the source data, and assessing utility, robustness, and privacy preservation.
We demonstrate the framework's effectiveness by auditing various generative models across diverse use cases.
arXiv Detail & Related papers (2023-04-21T09:03:18Z)
- Distributed Machine Learning and the Semblance of Trust [66.1227776348216]
Federated Learning (FL) allows the data owner to maintain data governance and perform model training locally without having to share their data.
FL and related techniques are often described as privacy-preserving.
We explain why this term is not appropriate and outline the risks associated with over-reliance on protocols that were not designed with formal definitions of privacy in mind.
arXiv Detail & Related papers (2021-12-21T08:44:05Z)
- AI Assurance using Causal Inference: Application to Public Policy [0.0]
Most AI approaches can only be represented as "black boxes" and suffer from the lack of transparency.
It is crucial not only to develop effective and robust AI systems, but to make sure their internal processes are explainable and fair.
arXiv Detail & Related papers (2021-12-01T16:03:06Z)
- Representative & Fair Synthetic Data [68.8204255655161]
We present a framework to incorporate fairness constraints into the self-supervised learning process.
We generate a representative as well as fair version of the UCI Adult census data set.
We consider representative & fair synthetic data a promising future building block to teach algorithms not on historic worlds, but rather on the worlds that we strive to live in.
arXiv Detail & Related papers (2021-04-07T09:19:46Z)
- Trustworthy Transparency by Design [57.67333075002697]
We propose a transparency framework for software design, incorporating research on user trust and experience.
Our framework enables developing software that incorporates transparency in its design.
arXiv Detail & Related papers (2021-03-19T12:34:01Z)
- Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses [150.64470864162556]
This work systematically categorizes and discusses a wide range of dataset vulnerabilities and exploits.
In addition to describing various poisoning and backdoor threat models and the relationships among them, we develop their unified taxonomy.
arXiv Detail & Related papers (2020-12-18T22:38:47Z)
- Towards Compliant Data Management Systems for Healthcare ML [6.057289837472806]
We review how data flows within machine learning projects in healthcare from source to storage to use in training algorithms and beyond.
Our objective is to design tools to detect and track sensitive data across machines and users across the life cycle of a project.
We build a prototype of the solution that demonstrates the difficulties in this domain.
arXiv Detail & Related papers (2020-11-15T15:27:51Z)
- Privacy Preservation in Federated Learning: An insightful survey from the GDPR Perspective [10.901568085406753]
This article surveys state-of-the-art privacy techniques that can be employed in federated learning (FL).
Recent research has demonstrated that retaining data and computation on-device in FL is not enough to guarantee privacy.
This is because the ML model parameters exchanged between parties in an FL system can be exploited in some privacy attacks.
arXiv Detail & Related papers (2020-11-10T21:41:25Z)
- Trustworthy AI [75.99046162669997]
Brittleness to minor adversarial changes in the input data, limited ability to explain decisions, and bias in training data are some of the most prominent limitations.
We propose the tutorial on Trustworthy AI to address six critical issues in enhancing user and public trust in AI systems.
arXiv Detail & Related papers (2020-11-02T20:04:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.