Data Distribution Shifts in (Industrial) Federated Learning as a Privacy Issue
- URL: http://arxiv.org/abs/2409.13875v1
- Date: Fri, 20 Sep 2024 20:09:19 GMT
- Title: Data Distribution Shifts in (Industrial) Federated Learning as a Privacy Issue
- Authors: David Brunner, Alessio Montuoro
- Abstract summary: We consider industrial federated learning, a collaboration between a small number of powerful, potentially competing industrial players.
We argue that this configuration harbours covert privacy risks that do not arise in e.g. cross-device settings.
We show an honest-but-curious attacker to be capable of detecting subtle distributional shifts on other clients.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider industrial federated learning, a collaboration between a small number of powerful, potentially competing industrial players, mediated by a third party aspiring to improve the service it provides to its customers. We argue that this configuration harbours covert privacy risks that do not arise in e.g. cross-device settings. Companies are very protective of their intellectual property and production processes. Information about changes to their production, and the timing of those changes, is to be kept private. We study a scenario in which one of the collaborators infers changes to their competitors' production by detecting potentially subtle temporal data distribution shifts. In this framing, a data distribution shift is always problematic, even if it has no negative effect on training convergence. Thus, our goal is to find means that allow the detection of distributional shifts better than customary evaluation metrics. Based on the assumption that even minor shifts are reflected in the collaboratively learned machine learning model, the attacker tracks the shared models' internal state with a selection of metrics from the literature in order to pick up on relevant changes. In an empirical study on benchmark datasets, we show an honest-but-curious attacker to be capable of detecting subtle distributional shifts on other clients, in some cases long before they become obvious in evaluation.
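As a rough illustration of this attack surface, the following minimal sketch tracks the round-to-round displacement of the shared model's parameters and raises an alarm when it deviates from a running baseline. The metric (L2 displacement) and threshold rule are illustrative assumptions, not the paper's exact selection of metrics:

```python
# Hypothetical sketch: an honest-but-curious participant logging a simple
# internal-state metric of the shared model across federated rounds.
import numpy as np

def flatten(params):
    """Concatenate a list of weight arrays into one vector."""
    return np.concatenate([p.ravel() for p in params])

class ShiftMonitor:
    def __init__(self, window=10, threshold=3.0):
        self.window = window        # rounds used for the running baseline
        self.threshold = threshold  # flag deviations beyond k standard deviations
        self.prev = None
        self.history = []

    def update(self, params):
        """Record this round's parameter displacement; return True on alarm."""
        vec = flatten(params)
        if self.prev is None:
            self.prev = vec
            return False
        step = np.linalg.norm(vec - self.prev)  # round-to-round drift of the global model
        self.prev = vec
        alarm = False
        if len(self.history) >= self.window:
            recent = self.history[-self.window:]
            mu, sigma = np.mean(recent), np.std(recent) + 1e-12
            alarm = abs(step - mu) > self.threshold * sigma
        self.history.append(step)
        return alarm
```

Any per-round scalar derived from the shared weights could be plugged in in place of the L2 displacement.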
Related papers
- Client-side Gradient Inversion Against Federated Learning from Poisoning [59.74484221875662]
Federated Learning (FL) enables distributed participants to train a global model without sharing their data directly with a central server.
Recent studies have revealed that FL is vulnerable to gradient inversion attack (GIA), which aims to reconstruct the original training samples.
We propose Client-side poisoning Gradient Inversion (CGI), which is a novel attack method that can be launched from clients.
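For context, a generic gradient-inversion loop (not the proposed CGI method; the model and observed gradients are assumed given) optimises a dummy sample until its gradient matches the one the attacker observed:

```python
# Hedged sketch of generic gradient inversion, assuming access to a model
# and a list of observed per-parameter gradients (observed_grads).
import torch

def invert(model, observed_grads, x_shape, num_classes, steps=200, lr=0.1):
    """Optimise a dummy (x, y) pair whose gradient matches the observed one."""
    dummy_x = torch.randn(1, *x_shape, requires_grad=True)
    dummy_y = torch.randn(1, num_classes, requires_grad=True)  # soft-label logits
    opt = torch.optim.Adam([dummy_x, dummy_y], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        logits = model(dummy_x)
        # soft-label cross-entropy so the label can be optimised too
        loss = -(dummy_y.softmax(-1) * logits.log_softmax(-1)).sum()
        grads = torch.autograd.grad(loss, model.parameters(), create_graph=True)
        # distance between the dummy gradient and the gradient that was observed
        match = sum(((g - og) ** 2).sum() for g, og in zip(grads, observed_grads))
        match.backward()
        opt.step()
    return dummy_x.detach(), dummy_y.softmax(-1).detach()
```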
arXiv Detail & Related papers (2023-09-14T03:48:27Z)
- Masked Autoencoders are Efficient Continual Federated Learners [20.856520787551453]
Continual learning should be grounded in unsupervised learning of representations that are shared across clients.
Masked autoencoders for distribution estimation are particularly amenable to this setup.
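A toy sketch of the masked-autoencoder training objective on flat feature vectors, purely illustrative rather than the paper's architecture:

```python
# Toy masked autoencoder: mask random features, reconstruct, and score the
# loss only on the masked positions (hypothetical stand-in architecture).
import torch
import torch.nn as nn

class TinyMAE(nn.Module):
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.dec = nn.Linear(hidden, dim)

    def forward(self, x, mask_ratio=0.75):
        mask = (torch.rand_like(x) < mask_ratio).float()
        recon = self.dec(self.enc(x * (1 - mask)))   # encode only visible features
        # reconstruction error on the masked positions only
        return ((recon - x) ** 2 * mask).sum() / mask.sum().clamp(min=1)
```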
arXiv Detail & Related papers (2023-06-06T09:38:57Z)
- Even Small Correlation and Diversity Shifts Pose Dataset-Bias Issues [19.4921353136871]
We study two types of distribution shifts: diversity shifts, which occur when test samples exhibit patterns unseen during training, and correlation shifts, which occur when test data present a different correlation between seen invariant and spurious features.
We propose an integrated protocol to analyze both types of shifts using datasets where they co-exist in a controllable manner.
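A hypothetical generator for a controllable correlation shift, where a spurious binary feature agrees with the label with tunable probability rho (all names and constants are illustrative):

```python
# Controllable correlation shift: the shortcut feature matches the label with
# probability rho in training but is made uninformative at test time.
import numpy as np

def make_split(n, rho, rng):
    y = rng.integers(0, 2, size=n)
    invariant = y + rng.normal(0, 0.5, size=n)        # truly predictive feature
    agree = rng.random(n) < rho                       # how often the shortcut matches y
    spurious = np.where(agree, y, 1 - y).astype(float)
    X = np.stack([invariant, spurious], axis=1)
    return X, y

rng = np.random.default_rng(0)
X_train, y_train = make_split(10_000, rho=0.95, rng=rng)  # shortcut works in training
X_test,  y_test  = make_split(2_000,  rho=0.50, rng=rng)  # shortcut useless at test
```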
arXiv Detail & Related papers (2023-05-09T23:40:23Z)
- When Do Curricula Work in Federated Learning? [56.88941905240137]
We find that curriculum learning largely alleviates non-IIDness.
The more disparate the data distributions across clients, the more the clients benefit from curriculum learning.
We propose a novel client selection technique that benefits from the real-world disparity in the clients.
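The paper's exact selection scheme is not reproduced here; a generic easy-to-hard curriculum on one client, which grows the training pool by per-sample loss, can be sketched as:

```python
# Generic easy-to-hard curriculum (illustrative, not the paper's scheme):
# rank samples by current loss and enlarge the pool as rounds progress.
import numpy as np

def curriculum_indices(losses, round_idx, total_rounds, min_frac=0.2):
    """Return indices of the easiest fraction of samples for this round."""
    frac = min(1.0, min_frac + (1 - min_frac) * round_idx / max(1, total_rounds - 1))
    k = max(1, int(frac * len(losses)))
    return np.argsort(losses)[:k]   # lowest-loss (easiest) samples first
```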
arXiv Detail & Related papers (2022-12-24T11:02:35Z)
- Vicious Classifiers: Assessing Inference-time Data Reconstruction Risk in Edge Computing [2.2636351333487315]
Privacy-preserving inference in edge computing encourages the users of machine-learning services to locally run a model on their private input.
We study how a vicious server can reconstruct the input data by observing only the model's outputs.
We present a new measure to assess the inference-time reconstruction risk.
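One hedged way to probe such a risk (assuming black-box access to the served model and an attacker-supplied decoder network) is to fit a decoder from outputs back to inputs and use its reconstruction error as a proxy:

```python
# Sketch of an inference-time reconstruction probe; decoder and loader are
# assumed inputs, and a proper study would score error on held-out data.
import torch

def reconstruction_risk(served_model, decoder, data_loader, epochs=5, lr=1e-3):
    opt = torch.optim.Adam(decoder.parameters(), lr=lr)
    loss = torch.tensor(float("nan"))
    for _ in range(epochs):
        for x, _ in data_loader:
            with torch.no_grad():
                out = served_model(x)        # only the outputs are observed
            x_hat = decoder(out)
            loss = ((x_hat - x) ** 2).mean()
            opt.zero_grad(); loss.backward(); opt.step()
    # lower reconstruction error = higher privacy risk
    return loss.item()
```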
arXiv Detail & Related papers (2022-12-08T12:05:50Z)
- Study of the performance and scalability of federated learning for medical imaging with intermittent clients [0.0]
Federated learning is a privacy-preserving technique based on data decentralization that is used to perform machine or deep learning in a secure way.
Use case of medical image analysis is proposed, using chest X-ray images obtained from an open data repository.
Different clients are simulated from the training data and selected in an unbalanced manner, so they do not all hold the same amount of data.
Two approaches are analyzed for the case of intermittent clients, since in a real scenario some clients may leave the training and new ones may join.
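A sketch of this simulation setup, with illustrative constants: a Dirichlet split yields unbalanced client partitions, and each client participates in a round only with some probability:

```python
# Illustrative simulation of unbalanced partitions and intermittent clients.
import numpy as np

rng = np.random.default_rng(42)
n_samples, n_clients = 10_000, 8
shares = rng.dirichlet(alpha=np.full(n_clients, 0.5))   # skewed data shares
sizes = (shares * n_samples).astype(int)
indices = np.split(rng.permutation(n_samples), np.cumsum(sizes)[:-1])

for rnd in range(20):
    # each client joins this round with probability 0.7 (arbitrary choice)
    active = [c for c in range(n_clients) if rng.random() < 0.7]
    # ... run local training on indices[c] for c in active, then aggregate ...
```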
arXiv Detail & Related papers (2022-07-18T13:18:34Z)
- Exploring the Trade-off between Plausibility, Change Intensity and Adversarial Power in Counterfactual Explanations using Multi-objective Optimization [73.89239820192894]
We argue that automated counterfactual generation should regard several aspects of the produced adversarial instances.
We present a novel framework for the generation of counterfactual examples.
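A minimal scalarised version of such a multi-objective search (the weights and the plausibility proxy are illustrative assumptions, not the paper's formulation) optimises the input directly:

```python
# Scalarised counterfactual search: trade off target-class confidence,
# change intensity, and a crude plausibility proxy. x has batch shape (1, ...).
import torch

def counterfactual(model, x, target, w_change=1.0, w_plaus=0.1, steps=300, lr=0.05):
    x_cf = x.clone().requires_grad_(True)
    opt = torch.optim.Adam([x_cf], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        logits = model(x_cf)
        adv = -logits[0, target]                        # push toward the target class
        change = (x_cf - x).abs().mean()                # keep the edit small
        plaus = (x_cf.clamp(0, 1) - x_cf).abs().mean()  # penalise leaving the valid range
        (adv + w_change * change + w_plaus * plaus).backward()
        opt.step()
    return x_cf.detach()
```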
arXiv Detail & Related papers (2022-05-20T15:02:53Z)
- Learning from Heterogeneous Data Based on Social Interactions over Graphs [58.34060409467834]
This work proposes a decentralized architecture, where individual agents aim at solving a classification problem while observing streaming features of different dimensions.
We show that the proposed strategy enables the agents to learn consistently under this highly heterogeneous setting.
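One common instance of such decentralized strategies is non-Bayesian social learning; a single belief-update step, with a row-stochastic combination matrix A as an assumed input, might look like:

```python
# Sketch of a social-learning belief update over a graph: each agent mixes its
# local Bayesian update with its neighbours' log-beliefs via the matrix A.
import numpy as np

def social_learning_step(log_beliefs, log_likelihoods, A):
    """log_beliefs, log_likelihoods: (agents, classes); A: row-stochastic (agents, agents)."""
    updated = A @ (log_beliefs + log_likelihoods)   # geometric averaging in log space
    updated -= updated.max(axis=1, keepdims=True)   # stabilise before normalising
    updated -= np.log(np.exp(updated).sum(axis=1, keepdims=True))
    return updated                                  # rows are log-probabilities again
```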
arXiv Detail & Related papers (2021-12-17T12:47:18Z)
- Tracking the risk of a deployed model and detecting harmful distribution shifts [105.27463615756733]
In practice, it may make sense to ignore benign shifts, under which the performance of a deployed model does not degrade substantially.
We argue that a sensible method for firing off a warning has to both (a) detect harmful shifts while ignoring benign ones, and (b) allow continuous monitoring of model performance without increasing the false alarm rate.
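A deliberately simplified monitor in this spirit, which uses a fixed degradation margin rather than the paper's false-alarm-controlled sequential test:

```python
# Simplified harmful-shift monitor: fire only when rolling error degrades
# beyond a tolerated margin over the deployment-time baseline.
import collections
import numpy as np

class RiskTracker:
    def __init__(self, baseline_error, margin=0.05, window=500):
        self.baseline = baseline_error            # error measured at deployment time
        self.margin = margin                      # degradation we tolerate as benign
        self.errors = collections.deque(maxlen=window)

    def observe(self, y_true, y_pred):
        self.errors.append(float(y_true != y_pred))
        if len(self.errors) < self.errors.maxlen:
            return False                          # not enough evidence yet
        return np.mean(self.errors) > self.baseline + self.margin  # harmful shift
```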
arXiv Detail & Related papers (2021-10-12T17:21:41Z)
- Understanding Clipping for Federated Learning: Convergence and Client-Level Differential Privacy [67.4471689755097]
This paper empirically demonstrates that the clipped FedAvg can perform surprisingly well even with substantial data heterogeneity.
We provide the convergence analysis of a differentially private (DP) FedAvg algorithm and highlight the relationship between clipping bias and the distribution of the clients' updates.
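The standard clipped-aggregation recipe this analysis concerns can be sketched as follows (hyperparameters are illustrative; the noise calibration is the usual client-level Gaussian-mechanism choice):

```python
# Clipped FedAvg with client-level Gaussian noise on the averaged update.
import numpy as np

def aggregate(client_updates, clip_norm=1.0, noise_mult=0.5, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    clipped = []
    for u in client_updates:                      # u: flat update vector of one client
        norm = np.linalg.norm(u)
        clipped.append(u * min(1.0, clip_norm / (norm + 1e-12)))  # per-client clipping
    avg = np.mean(clipped, axis=0)
    sigma = noise_mult * clip_norm / len(client_updates)          # scale for the mean
    return avg + rng.normal(0.0, sigma, size=avg.shape)
```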
arXiv Detail & Related papers (2021-06-25T14:47:19Z)
- Robust Fairness under Covariate Shift [11.151913007808927]
Making predictions that are fair with regard to protected group membership has become an important requirement for classification algorithms.
We propose an approach that obtains a predictor robust to the worst case in terms of target performance.
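For context only, the standard covariate-shift baseline estimates importance weights with a domain classifier; the paper's worst-case robust formulation goes beyond this:

```python
# Standard importance-weighting baseline for covariate shift (not the paper's
# method): a domain classifier estimates the density ratio p_target/p_train.
import numpy as np
from sklearn.linear_model import LogisticRegression

def importance_weights(X_train, X_target):
    X = np.vstack([X_train, X_target])
    d = np.r_[np.zeros(len(X_train)), np.ones(len(X_target))]
    clf = LogisticRegression(max_iter=1000).fit(X, d)
    p = clf.predict_proba(X_train)[:, 1]
    return p / (1 - p)    # odds ratio approximates the density ratio
```

The returned weights would typically be passed as sample_weight when fitting the downstream predictor.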
arXiv Detail & Related papers (2020-10-11T04:42:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.