Related papers: Federated Unlearning Model Recovery in Data with Skewed Label Distributions

Federated Unlearning Model Recovery in Data with Skewed Label Distributions

URL: http://arxiv.org/abs/2412.13466v2
Date: Fri, 20 Dec 2024 11:03:45 GMT
Title: Federated Unlearning Model Recovery in Data with Skewed Label Distributions
Authors: Xinrui Yu, Wenbin Pei, Bing Xue, Qiang Zhang,
Abstract summary: This paper proposes a recovery method of federated unlearning with skewed label distributions.<n>We first adopt a strategy that incorporates oversampling with deep learning to supplement the skewed class data.<n>Then, a density-based denoising method is applied to remove noise from the generated data.<n>All the remaining clients leverage the enhanced local datasets and engage in iterative training to effectively restore the performance of the unlearning model.
Score: 10.236494861079779
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In federated learning, federated unlearning is a technique that provides clients with a rollback mechanism that allows them to withdraw their data contribution without training from scratch. However, existing research has not considered scenarios with skewed label distributions. Unfortunately, the unlearning of a client with skewed data usually results in biased models and makes it difficult to deliver high-quality service, complicating the recovery process. This paper proposes a recovery method of federated unlearning with skewed label distributions. Specifically, we first adopt a strategy that incorporates oversampling with deep learning to supplement the skewed class data for clients to perform recovery training, therefore enhancing the completeness of their local datasets. Afterward, a density-based denoising method is applied to remove noise from the generated data, further improving the quality of the remaining clients' datasets. Finally, all the remaining clients leverage the enhanced local datasets and engage in iterative training to effectively restore the performance of the unlearning model. Extensive evaluations on commonly used federated learning datasets with varying degrees of skewness show that our method outperforms baseline methods in restoring the performance of the unlearning model, particularly regarding accuracy on the skewed class.

Related papers

Lightweight Unsupervised Federated Learning with Pretrained Vision Language Model [32.094290282897894]
Federated learning aims to train a collective model from physically isolated clients while safeguarding the privacy of users' data. We propose a novel lightweight unsupervised federated learning approach that leverages unlabeled data on each client to perform lightweight model training and communication. Our proposed method greatly enhances model performance in comparison to CLIP's zero-shot predictions and even outperforms supervised federated learning benchmark methods.
arXiv Detail & Related papers (2024-04-17T03:42:48Z)
Incremental Self-training for Semi-supervised Learning [56.57057576885672]
IST is simple yet effective and fits existing self-training-based semi-supervised learning methods. We verify the proposed IST on five datasets and two types of backbone, effectively improving the recognition accuracy and learning speed.
arXiv Detail & Related papers (2024-04-14T05:02:00Z)
Take the Bull by the Horns: Hard Sample-Reweighted Continual Training Improves LLM Generalization [165.98557106089777]
A key challenge is to enhance the capabilities of large language models (LLMs) amid a looming shortage of high-quality training data. Our study starts from an empirical strategy for the light continual training of LLMs using their original pre-training data sets. We then formalize this strategy into a principled framework of Instance-Reweighted Distributionally Robust Optimization.
arXiv Detail & Related papers (2024-02-22T04:10:57Z)
Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data. One key challenge in federated learning is to handle non-identically distributed data across the clients. We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
Noisy Self-Training with Synthetic Queries for Dense Retrieval [49.49928764695172]
We introduce a novel noisy self-training framework combined with synthetic queries. Experimental results show that our method improves consistently over existing methods. Our method is data efficient and outperforms competitive baselines.
arXiv Detail & Related papers (2023-11-27T06:19:50Z)
One-Shot Federated Learning with Classifier-Guided Diffusion Models [44.604485649167216]
One-shot federated learning (OSFL) has gained attention in recent years due to its low communication cost. In this paper, we explore the novel opportunities that diffusion models bring to OSFL and propose FedCADO. FedCADO generates data that complies with clients' distributions and subsequently training the aggregated model on the server.
arXiv Detail & Related papers (2023-11-15T11:11:25Z)
REPA: Client Clustering without Training and Data Labels for Improved Federated Learning in Non-IID Settings [1.69188400758521]
We present REPA, an approach to client clustering in non-IID FL settings that requires neither training nor labeled data collection. REPA uses a novel supervised autoencoder-based method to create embeddings that profile a client's underlying data-generating processes without exposing the data to the server and without requiring local training.
arXiv Detail & Related papers (2023-09-25T12:30:43Z)
Stabilizing and Improving Federated Learning with Non-IID Data and Client Dropout [15.569507252445144]
Label distribution skew induced data heterogeniety has been shown to be a significant obstacle that limits the model performance in federated learning. We propose a simple yet effective framework by introducing a prior-calibrated softmax function for computing the cross-entropy loss. The improved model performance over existing baselines in the presence of non-IID data and client dropout is demonstrated.
arXiv Detail & Related papers (2023-03-11T05:17:59Z)
Integrating Local Real Data with Global Gradient Prototypes for Classifier Re-Balancing in Federated Long-Tailed Learning [60.41501515192088]
Federated Learning (FL) has become a popular distributed learning paradigm that involves multiple clients training a global model collaboratively. The data samples usually follow a long-tailed distribution in the real world, and FL on the decentralized and long-tailed data yields a poorly-behaved global model. In this work, we integrate the local real data with the global gradient prototypes to form the local balanced datasets.
arXiv Detail & Related papers (2023-01-25T03:18:10Z)
Uncertainty Minimization for Personalized Federated Semi-Supervised Learning [15.123493340717303]
We propose a novel semi-supervised learning paradigm which allows partial-labeled or unlabeled clients to seek labeling assistance from data-related clients (helper agents) Experiments show that our proposed method can obtain superior performance and more stable convergence than other related works with partial labeled data.
arXiv Detail & Related papers (2022-05-05T04:41:27Z)
Class-Aware Contrastive Semi-Supervised Learning [51.205844705156046]
We propose a general method named Class-aware Contrastive Semi-Supervised Learning (CCSSL) to improve pseudo-label quality and enhance the model's robustness in the real-world setting. Our proposed CCSSL has significant performance improvements over the state-of-the-art SSL methods on the standard datasets CIFAR100 and STL10.
arXiv Detail & Related papers (2022-03-04T12:18:23Z)
Auto-weighted Robust Federated Learning with Corrupted Data Sources [7.475348174281237]
Federated learning provides a communication-efficient and privacy-preserving training process. Standard federated learning techniques that naively minimize an average loss function are vulnerable to data corruptions. We propose Auto-weighted Robust Federated Learning (arfl) to provide robustness against corrupted data sources.
arXiv Detail & Related papers (2021-01-14T21:54:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.