Your Data, My Model: Learning Who Really Helps in Federated Learning
- URL: http://arxiv.org/abs/2409.02064v3
- Date: Wed, 28 May 2025 20:56:18 GMT
- Title: Your Data, My Model: Learning Who Really Helps in Federated Learning
- Authors: Shamsiiat Abdurakhmanova, Amirhossein Mohammadi, Yasmin SarcheshmehPour, Alexander Jung
- Abstract summary: A key challenge is determining which peers are most beneficial for collaboration. We propose a simple and privacy-preserving method to select relevant collaborators. Our approach enables model-agnostic, data-driven peer selection for personalized federated learning.
- Score: 47.0304843350031
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many important machine learning applications involve networks of devices, such as wearables or smartphones, that generate local data and train personalized models. A key challenge is determining which peers are most beneficial for collaboration. We propose a simple and privacy-preserving method to select relevant collaborators by evaluating how much a model improves after a single gradient step using another device's data, without sharing raw data. This method naturally extends to non-parametric models by replacing the gradient step with a non-parametric generalization. Our approach enables model-agnostic, data-driven peer selection for personalized federated learning (PersFL).
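A minimal sketch of the selection rule, assuming a linear model with squared loss (the function and variable names are illustrative, not taken from the paper):

```python
import numpy as np

def peer_gradient(w, X_peer, y_peer):
    """Computed locally by the peer: the gradient of its mean squared
    error at our current weights w. Only this vector is shared."""
    return 2 * X_peer.T @ (X_peer @ w - y_peer) / len(y_peer)

def one_step_gain(w, X_own, y_own, grad_peer, lr=0.1):
    """How much one gradient step along the peer's gradient reduces
    our own local loss; a positive gain marks a helpful collaborator."""
    def loss(w):
        return np.mean((X_own @ w - y_own) ** 2)
    return loss(w) - loss(w - lr * grad_peer)

# A device ranks peers by one_step_gain(w, X_i, y_i, g_j) over the
# gradients g_j it receives, and collaborates with the top scorers;
# raw data never leaves any device.
```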
Related papers
- Vertical Federated Unlearning via Backdoor Certification [15.042986414487922]
VFL offers a novel paradigm in machine learning, enabling distinct entities to train models cooperatively while maintaining data privacy. Recent privacy regulations emphasize an individual's "right to be forgotten", which necessitates the ability for models to unlearn specific training data. We introduce an innovative modification to traditional VFL by employing a mechanism that inverts the typical learning trajectory with the objective of extracting specific data contributions.
arXiv Detail & Related papers (2024-12-16T06:40:25Z)
- Generating Realistic Tabular Data with Large Language Models [49.03536886067729]
Large language models (LLMs) have been used for diverse tasks, but they do not capture the correct correlation between the features and the target variable.
We propose an LLM-based method with three important improvements to correctly capture the ground-truth feature-class correlation in the real data.
Our experiments show that our method significantly outperforms 10 SOTA baselines on 20 datasets in downstream tasks.
arXiv Detail & Related papers (2024-10-29T04:14:32Z)
- Promoting Data and Model Privacy in Federated Learning through Quantized LoRA [41.81020951061438]
We introduce a method that needs to distribute only a quantized version of the model's parameters during training.
We combine this quantization strategy with LoRA, a popular and parameter-efficient fine-tuning method, to significantly reduce communication costs in federated learning.
The proposed framework, named FedLPP, successfully ensures both data and model privacy in the federated learning context.
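A rough sketch of the general recipe (quantized base weights plus trainable low-rank adapters; this is the generic LoRA idea, not the paper's exact FedLPP protocol):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen, quantized base weight plus a trainable low-rank update."""
    def __init__(self, base_weight, rank=4):
        super().__init__()
        # Simulated 8-bit quantization of the weights the server ships.
        scale = base_weight.abs().max() / 127
        self.register_buffer("w_q", torch.round(base_weight / scale) * scale)
        out_f, in_f = base_weight.shape
        self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)  # trainable
        self.B = nn.Parameter(torch.zeros(out_f, rank))        # trainable

    def forward(self, x):
        # Effective weight = quantized base + low-rank correction B @ A.
        return x @ (self.w_q + self.B @ self.A).T

# Only A and B are trained and communicated, so the full-precision
# weights never leave the server and communication stays cheap.
```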
arXiv Detail & Related papers (2024-06-16T15:23:07Z)
- Generative Dataset Distillation: Balancing Global Structure and Local Details [49.20086587208214]
We propose a new dataset distillation method that balances global structure and local details.
Our method involves using a conditional generative adversarial network to generate the distilled dataset.
arXiv Detail & Related papers (2024-04-26T23:46:10Z)
- Learn What You Need in Personalized Federated Learning [53.83081622573734]
Learn2pFed is a novel algorithm-unrolling-based personalized federated learning framework.
We show that Learn2pFed significantly outperforms previous personalized federated learning methods.
arXiv Detail & Related papers (2024-01-16T12:45:15Z)
- Personalized Federated Learning with Contextual Modulation and Meta-Learning [2.7716102039510564]
Federated learning has emerged as a promising approach for training machine learning models on decentralized data sources.
We propose a novel framework that combines federated learning with meta-learning techniques to enhance both efficiency and generalization capabilities.
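As a rough illustration of combining federated aggregation with meta-learning, here is a Reptile-style meta-update; the paper's contextual-modulation mechanism is more involved, so treat this only as a sketch of the flavor:

```python
def reptile_meta_step(global_w, client_ws, meta_lr=0.5):
    """Move the global model toward the average of the client-adapted
    models (Reptile-style; dicts map parameter names to tensors/arrays)."""
    avg = {k: sum(w[k] for w in client_ws) / len(client_ws) for k in global_w}
    return {k: global_w[k] + meta_lr * (avg[k] - global_w[k]) for k in global_w}
```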
arXiv Detail & Related papers (2023-12-23T08:18:22Z)
- Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
- Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
The challenge is to discard information about the "forget" data without altering knowledge about the remaining dataset.
We adopt a projected-gradient-based learning method, named Projected-Gradient Unlearning (PGU).
We provide empirical evidence that our unlearning method produces models that behave similarly to models retrained from scratch across various metrics, even when the training dataset is no longer accessible.
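A minimal sketch of the projection idea, assuming an orthonormal basis for the retained-data gradient subspace has already been computed (all names are illustrative):

```python
import numpy as np

def project_out(grad, retain_basis):
    """Remove the components of `grad` that lie in the span of the
    retained-data gradients (retain_basis: d x k, orthonormal columns)."""
    return grad - retain_basis @ (retain_basis.T @ grad)

# Hypothetical unlearning step: update on the forget-set objective only
# in directions orthogonal to the retained data's gradient subspace, so
# knowledge about the remaining dataset is left (nearly) untouched.
# w = w - lr * project_out(forget_grad, retain_basis)
```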
arXiv Detail & Related papers (2023-12-07T07:17:24Z)
- Few-Shot Object Detection via Synthetic Features with Optimal Transport [28.072187044345107]
We propose a novel approach in which we train a generator to generate synthetic data for novel classes.
Our overarching goal is to train a generator that captures the data variations of the base dataset.
We then transform the captured variations into novel classes by generating synthetic data with the trained generator.
arXiv Detail & Related papers (2023-08-29T03:54:26Z)
- Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods with much fewer computational resources.
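The core of distribution matching can be sketched as aligning feature statistics of a learnable synthetic set with the real data; a minimal version under a randomly initialized embedding network (names illustrative, not the paper's exact objective):

```python
import torch

def matching_loss(embed, x_real, x_syn):
    """Squared distance between the mean feature embeddings of a real
    batch and the learnable synthetic images."""
    return (embed(x_real).mean(0) - embed(x_syn).mean(0)).pow(2).sum()

# x_syn = torch.randn(50, 3, 32, 32, requires_grad=True)  # distilled set
# opt = torch.optim.SGD([x_syn], lr=0.1)
# opt.zero_grad(); matching_loss(net, x_real, x_syn).backward(); opt.step()
```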
arXiv Detail & Related papers (2023-07-19T04:07:33Z)
- Exploring Data Redundancy in Real-world Image Classification through Data Selection [20.389636181891515]
Deep learning models often require large amounts of data for training, leading to increased costs.
We present two data valuation metrics based on Synaptic Intelligence and gradient norms, respectively, to study redundancy in real-world image data.
Online and offline data selection algorithms are then proposed via clustering and grouping based on the examined data values.
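A sketch of the gradient-norm flavor of data valuation (per-example scores only; the paper's Synaptic Intelligence metric and the clustering-based selection step are omitted):

```python
import torch

def gradient_norm_scores(model, loss_fn, examples):
    """Score each (x, y) pair by the norm of its loss gradient w.r.t.
    the model parameters; low-scoring points are redundancy candidates."""
    scores = []
    for x, y in examples:
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        sq = sum(p.grad.pow(2).sum() for p in model.parameters()
                 if p.grad is not None)
        scores.append(float(sq) ** 0.5)
    return scores
```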
arXiv Detail & Related papers (2023-06-25T03:31:05Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [47.432215933099016]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models. This creates a barrier to fusing knowledge across individual models to yield a better single model. We propose a dataless knowledge fusion method that merges models in their parameter space.
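A toy version of parameter-space merging, using a plain weighted average as a stand-in for the paper's more refined merging rule (all checkpoints must share one architecture):

```python
def merge_state_dicts(state_dicts, weights=None):
    """Fuse fine-tuned checkpoints without any data by averaging their
    parameters; values may be torch tensors or numpy arrays."""
    weights = weights or [1.0 / len(state_dicts)] * len(state_dicts)
    return {k: sum(w * sd[k] for w, sd in zip(weights, state_dicts))
            for k in state_dicts[0]}
```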
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data only by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z)
- Achieving Representative Data via Convex Hull Feasibility Sampling Algorithms [35.29582673348303]
Sampling biases in training data are a major source of algorithmic biases in machine learning systems.
We present adaptive sampling methods to determine, with high confidence, whether it is possible to assemble a representative dataset from the given data sources.
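Checking whether a representative dataset is even assemblable reduces to a convex hull membership test, which can be posed as a linear feasibility problem; a deterministic sketch that ignores the paper's confidence machinery (names illustrative):

```python
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(target, sources):
    """Can `target` (desired population statistics, shape (d,)) be written
    as a convex combination of the source distributions (rows of an
    (n, d) array)?  Solved as a linear feasibility problem."""
    n = len(sources)
    # Find lambda >= 0 with sum(lambda) = 1 and lambda @ sources = target.
    A_eq = np.vstack([sources.T, np.ones(n)])
    b_eq = np.append(target, 1.0)
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * n)
    return res.success
```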
arXiv Detail & Related papers (2022-04-13T23:14:05Z)
- Uniform-in-Phase-Space Data Selection with Iterative Normalizing Flows [0.0]
A strategy is proposed to select data points such that they uniformly span the phase-space of the data.
An iterative method is used to accurately estimate the probability of the rare data points when only a small subset of the dataset is used to construct the probability map.
The proposed framework is demonstrated as a viable pathway to enable data-efficient machine learning when abundant data is available.
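A small sketch of density-inverse selection, using kernel density estimation as a stand-in for the paper's iterative normalizing flows:

```python
import numpy as np
from sklearn.neighbors import KernelDensity

def uniform_phase_space_subset(X, n_keep, seed=0):
    """Keep points that cover phase space uniformly: estimate the data
    density and sample with probability inversely proportional to it,
    so rare regions are retained."""
    rng = np.random.default_rng(seed)
    log_p = KernelDensity(bandwidth=0.5).fit(X).score_samples(X)
    w = np.exp(log_p.min() - log_p)  # proportional to 1 / density, stable
    idx = rng.choice(len(X), size=n_keep, replace=False, p=w / w.sum())
    return X[idx]
```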
arXiv Detail & Related papers (2021-12-28T20:06:28Z)
- A Personalized Federated Learning Algorithm: an Application in Anomaly Detection [0.6700873164609007]
Federated Learning (FL) has recently emerged as a promising method to overcome data privacy and transmission issues.
In FL, datasets collected from different devices or sensors are used to train local models (clients), each of which shares its learning with a centralized model (server).
This paper proposes a novel Personalized FedAvg (PC-FedAvg) which aims to control weights communication and aggregation augmented with a tailored learning algorithm to personalize the resulting models at each client.
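One way to read the personalization idea: average only the shared parameters and let each client keep its own local layers. A sketch, not the paper's exact PC-FedAvg weighting scheme:

```python
import torch

def aggregate_shared(client_states, shared_keys):
    """FedAvg-style mean over the shared backbone parameters only;
    personal layers never leave the client."""
    return {k: torch.stack([sd[k] for sd in client_states]).mean(0)
            for k in shared_keys}

# Each client loads the returned backbone and keeps (or fine-tunes) its
# own head, yielding one personalized model per client.
```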
arXiv Detail & Related papers (2021-11-04T04:57:11Z)
- A Single Example Can Improve Zero-Shot Data Generation [7.237231992155901]
Sub-tasks of intent classification require extensive and flexible datasets for experiments and evaluation.
We propose to use text generation methods to gather datasets.
We explore two approaches to generating task-oriented utterances.
arXiv Detail & Related papers (2021-08-16T09:43:26Z)
- BREEDS: Benchmarks for Subpopulation Shift [98.90314444545204]
We develop a methodology for assessing the robustness of models to subpopulation shift.
We leverage the class structure underlying existing datasets to control the data subpopulations that comprise the training and test distributions.
Applying this methodology to the ImageNet dataset, we create a suite of subpopulation shift benchmarks of varying granularity.
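The split logic can be sketched directly: for each superclass, divide its subclasses between source and target domains, so the label set matches while the underlying subpopulations shift (illustrative, not the released benchmark code):

```python
import random

def subpopulation_split(superclasses, seed=0):
    """Send half of each superclass's subclasses to the source (train)
    domain and half to the target (test) domain."""
    rng = random.Random(seed)
    source, target = {}, {}
    for name, subs in superclasses.items():
        subs = list(subs)
        rng.shuffle(subs)
        mid = len(subs) // 2
        source[name], target[name] = subs[:mid], subs[mid:]
    return source, target

# e.g. subpopulation_split({"dog": ["husky", "beagle", "pug", "chow"]})
```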
arXiv Detail & Related papers (2020-08-11T17:04:47Z)
- Multi-Center Federated Learning [62.57229809407692]
This paper proposes a novel multi-center aggregation mechanism for federated learning.
It learns multiple global models from the non-IID user data and simultaneously derives the optimal matching between users and centers.
Our experimental results on benchmark datasets show that our method outperforms several popular federated learning methods.
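A compact sketch of the multi-center idea, with k-means over flattened client models standing in for the paper's joint optimization of assignments and centers:

```python
import numpy as np
from sklearn.cluster import KMeans

def multi_center_aggregate(client_updates, n_centers=3):
    """Cluster flattened client models and keep one aggregated 'global'
    model per cluster, instead of a single global average."""
    flat = np.stack([u.ravel() for u in client_updates])
    labels = KMeans(n_clusters=n_centers, n_init=10).fit_predict(flat)
    centers = [flat[labels == k].mean(axis=0) for k in range(n_centers)]
    return labels, centers
```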
arXiv Detail & Related papers (2020-05-03T09:14:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.