Distributed Machine Learning and the Semblance of Trust
- URL: http://arxiv.org/abs/2112.11040v1
- Date: Tue, 21 Dec 2021 08:44:05 GMT
- Title: Distributed Machine Learning and the Semblance of Trust
- Authors: Dmitrii Usynin, Alexander Ziller, Daniel Rueckert, Jonathan
Passerat-Palmbach, Georgios Kaissis
- Abstract summary: Federated Learning (FL) allows the data owner to maintain data governance and perform model training locally without having to share their data.
FL and related techniques are often described as privacy-preserving.
We explain why this term is not appropriate and outline the risks associated with over-reliance on protocols that were not designed with formal definitions of privacy in mind.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The utilisation of large and diverse datasets for machine learning (ML) at
scale is required to promote scientific insight into many meaningful problems.
However, due to data governance regulations such as GDPR as well as ethical
concerns, the aggregation of personal and sensitive data is problematic, which
prompted the development of alternative strategies such as distributed ML
(DML). Techniques such as Federated Learning (FL) allow the data owner to
maintain data governance and perform model training locally without having to
share their data. FL and related techniques are often described as
privacy-preserving. We explain why this term is not appropriate and outline the
risks associated with over-reliance on protocols that were not designed with
formal definitions of privacy in mind. We further provide recommendations and
examples on how such algorithms can be augmented to provide guarantees of
governance, security, privacy and verifiability for a general ML audience
without prior exposure to formal privacy techniques.
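The abstract's core idea, that FL keeps raw data local but can be augmented toward formal privacy guarantees, can be sketched as a toy federated-averaging (FedAvg) loop with an optional noisy-aggregation step. This is a minimal illustrative sketch, not the paper's method: the least-squares model, data shapes, hyperparameters, and the `fedavg`/`local_step` helpers are all assumptions chosen for brevity.

```python
import numpy as np

def local_step(w, X, y, lr=0.1):
    """One gradient step on a client's private data; raw data stays local."""
    grad = 2.0 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def fedavg(clients, rounds=50, noise_scale=0.0, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(clients[0][0].shape[1])  # global model held by the server
    for _ in range(rounds):
        # Each client trains locally and shares only model parameters.
        updates = [local_step(w, X, y) for X, y in clients]
        # The server aggregates those parameters -- exactly the values that
        # gradient-inversion and membership-inference attacks can exploit.
        w = np.mean(updates, axis=0)
        # Optional noise on the aggregate, gesturing at DP-style protection.
        w += rng.normal(scale=noise_scale, size=w.shape)
    return w

rng = np.random.default_rng(1)
clients = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(4)]
w_plain = fedavg(clients)                   # vanilla FL: no formal guarantee
w_noisy = fedavg(clients, noise_scale=0.1)  # noise-augmented variant
```

Note that the vanilla loop already illustrates the paper's warning: the exchanged parameters `w` are observable to the server, so "privacy-preserving" is not automatic. The noisy variant alone would also not constitute differential privacy; a rigorous augmentation would additionally clip client updates and account for the cumulative privacy budget.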
Related papers
- Privacy in Federated Learning [0.0]
Federated Learning (FL) represents a significant advancement in distributed machine learning.
This chapter delves into the core privacy concerns within FL, including the risks of data reconstruction, model inversion attacks, and membership inference.
It examines the trade-offs between model accuracy and privacy, emphasizing the importance of balancing these factors in practical implementations.
arXiv Detail & Related papers (2024-08-12T18:41:58Z) - Collection, usage and privacy of mobility data in the enterprise and public administrations [55.2480439325792]
Security measures such as anonymization are needed to protect individuals' privacy.
Within our study, we conducted expert interviews to gain insights into practices in the field.
We survey privacy-enhancing methods in use, which generally do not comply with state-of-the-art standards of differential privacy.
arXiv Detail & Related papers (2024-07-04T08:29:27Z) - Ungeneralizable Examples [70.76487163068109]
Current approaches to creating unlearnable data involve incorporating small, specially designed noises.
We extend the concept of unlearnable data to conditional data learnability and introduce UnGeneralizable Examples (UGEs).
UGEs exhibit learnability for authorized users while maintaining unlearnability for potential hackers.
arXiv Detail & Related papers (2024-04-22T09:29:14Z) - Data Collaboration Analysis Over Matrix Manifolds [0.0]
Privacy-Preserving Machine Learning (PPML) addresses this challenge by safeguarding sensitive information.
The NRI-DC framework emerges as an innovative approach, potentially resolving the 'data island' issue among institutions.
This study establishes a rigorous theoretical foundation for these collaboration functions and introduces new formulations.
arXiv Detail & Related papers (2024-03-05T08:52:16Z) - State-of-the-Art Approaches to Enhancing Privacy Preservation of Machine Learning Datasets: A Survey [0.0]
This paper examines the evolving landscape of machine learning (ML) and its profound impact across various sectors.
It focuses on the emerging field of Privacy-preserving Machine Learning (PPML)
As ML applications become increasingly integral to industries like telecommunications, financial technology, and surveillance, they raise significant privacy concerns.
arXiv Detail & Related papers (2024-02-25T17:31:06Z) - PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind)
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z) - A Unified View of Differentially Private Deep Generative Modeling [60.72161965018005]
Data with privacy concerns comes with stringent regulations that frequently prohibit data access and data sharing.
Overcoming these obstacles is key for technological progress in many real-world application scenarios that involve privacy sensitive data.
Differentially private (DP) data publishing provides a compelling solution, where only a sanitized form of the data is publicly released.
arXiv Detail & Related papers (2023-09-27T14:38:16Z) - UFed-GAN: A Secure Federated Learning Framework with Constrained
Computation and Unlabeled Data [50.13595312140533]
We propose a novel framework of UFed-GAN: Unsupervised Federated Generative Adversarial Network, which can capture user-side data distribution without local classification training.
Our experimental results demonstrate the strong potential of UFed-GAN in addressing limited computational resources and unlabeled data while preserving privacy.
arXiv Detail & Related papers (2023-08-10T22:52:13Z) - Offline Reinforcement Learning with Differential Privacy [16.871660060209674]
The offline reinforcement learning problem is often motivated by the need to learn data-driven decision policies in financial, legal, and healthcare applications.
We design offline RL algorithms with differential privacy guarantees which provably prevent such risks.
arXiv Detail & Related papers (2022-06-02T00:45:04Z) - Privacy Preservation in Federated Learning: An insightful survey from
the GDPR Perspective [10.901568085406753]
This article surveys the state-of-the-art privacy techniques that can be employed in Federated Learning.
Recent research has demonstrated that keeping data and computation local in FL is not enough to guarantee privacy.
This is because the ML model parameters exchanged between parties in an FL system can be exploited in privacy attacks.
arXiv Detail & Related papers (2020-11-10T21:41:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.