Uncertainty-Based Extensible Codebook for Discrete Federated Learning in Heterogeneous Data Silos
- URL: http://arxiv.org/abs/2402.18888v4
- Date: Thu, 05 Jun 2025 04:46:31 GMT
- Title: Uncertainty-Based Extensible Codebook for Discrete Federated Learning in Heterogeneous Data Silos
- Authors: Tianyi Zhang, Yu Cao, Dianbo Liu
- Abstract summary: Federated learning, aimed at leveraging vast distributed datasets, confronts a crucial challenge: the heterogeneity of data across different silos. We propose an innovative yet straightforward iterative framework, termed Uncertainty-Based Extensible-Codebook Federated Learning (UEFL). This framework dynamically maps latent features to trainable discrete vectors, assesses the uncertainty, and specifically extends the discretization dictionary or codebook for silos exhibiting high uncertainty.
- Score: 11.443755718706562
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated learning (FL), aimed at leveraging vast distributed datasets, confronts a crucial challenge: the heterogeneity of data across different silos. While previous studies have explored discrete representations to enhance model generalization across minor distributional shifts, these approaches often struggle to adapt to new data silos with significantly divergent distributions. In response, we have identified that models derived from FL exhibit markedly increased uncertainty when applied to data silos with unfamiliar distributions. Consequently, we propose an innovative yet straightforward iterative framework, termed Uncertainty-Based Extensible-Codebook Federated Learning (UEFL). This framework dynamically maps latent features to trainable discrete vectors, assesses the uncertainty, and specifically extends the discretization dictionary or codebook for silos exhibiting high uncertainty. Our approach aims to simultaneously enhance accuracy and reduce uncertainty by explicitly addressing the diversity of data distributions, all while maintaining minimal computational overhead in environments characterized by heterogeneous data silos. Extensive experiments across multiple datasets demonstrate that UEFL outperforms state-of-the-art methods, achieving significant improvements in accuracy (by 3%–22.1%) and uncertainty reduction (by 38.83%–96.24%). The source code is available at https://github.com/destiny301/uefl.
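The linked repository holds the reference implementation; the sketch below is only a minimal, hypothetical rendering of the loop the abstract describes: latent features are snapped to trainable codebook vectors (as in vector quantization), mean predictive entropy stands in for the uncertainty measure, and the codebook is extended for any silo whose uncertainty exceeds a threshold. The class and function names, the entropy-based score, and the threshold logic are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of UEFL's core idea: vector-quantized latent features
# plus uncertainty-driven codebook extension. Not the authors' implementation.
import torch
import torch.nn.functional as F

class ExtensibleCodebook(torch.nn.Module):
    def __init__(self, num_codes: int, dim: int):
        super().__init__()
        # Trainable discrete vectors that latent features are mapped to.
        self.codes = torch.nn.Parameter(torch.randn(num_codes, dim) * 0.02)

    def quantize(self, z: torch.Tensor) -> torch.Tensor:
        # Nearest-neighbor lookup; the straight-through trick keeps gradients
        # flowing to the encoder despite the discrete assignment.
        dists = torch.cdist(z, self.codes)   # (batch, num_codes)
        idx = dists.argmin(dim=1)
        z_q = self.codes[idx]
        return z + (z_q - z).detach()

    def extend(self, num_new: int) -> None:
        # Append fresh vectors for a high-uncertainty silo (assumed mechanism);
        # the optimizer must be re-created afterwards to track the new tensor.
        new = torch.randn(num_new, self.codes.shape[1],
                          device=self.codes.device) * 0.02
        self.codes = torch.nn.Parameter(
            torch.cat([self.codes.detach(), new], dim=0))

def predictive_entropy(logits: torch.Tensor) -> torch.Tensor:
    # Mean entropy of softmax predictions, used here as the uncertainty score.
    p = F.softmax(logits, dim=-1)
    return -(p * p.clamp_min(1e-12).log()).sum(dim=-1).mean()

# Per-round check on one silo (threshold TAU and growth size are illustrative):
# if predictive_entropy(model(x_silo)) > TAU:
#     codebook.extend(num_new=K_EXTRA)   # then fine-tune on that silo
```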
Related papers
- Federated Loss Exploration for Improved Convergence on Non-IID Data [20.979550470097823]
Federated Loss Exploration (FedLEx) is an innovative approach specifically designed to tackle the convergence challenges posed by non-IID data. FedLEx distinctively addresses the shortcomings of existing FL methods in non-IID settings. Our experiments with state-of-the-art FL algorithms demonstrate significant improvements in performance.
arXiv Detail & Related papers (2025-06-23T13:42:07Z) - Learnable Sparse Customization in Heterogeneous Edge Computing [27.201987866208484]
We propose Learnable Personalized Sparsification for heterogeneous Federated learning (FedLPS)
FedLPS learns the importance of model units on the local data representation and derives an importance-based sparse pattern to accurately extract personalized data features.
Experiments show that FedLPS outperforms status-quo approaches in both accuracy and training cost.
arXiv Detail & Related papers (2024-12-10T06:14:31Z) - Uncertainty Quantification via Hölder Divergence for Multi-View Representation Learning [18.419742575630217]
This paper introduces a novel algorithm based on Hölder Divergence (HD) to enhance the reliability of multi-view learning.
Dempster-Shafer theory is used to integrate the uncertainty from different modalities, thereby generating a comprehensive result.
Mathematically, HD proves to better measure the "distance" between the real data distribution and the model's predictive distribution (a minimal numerical sketch of this divergence appears after this list).
arXiv Detail & Related papers (2024-10-29T04:29:44Z) - Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition [50.61991746981703]
Current state-of-the-art LTSSL approaches rely on high-quality pseudo-labels for large-scale unlabeled data.
This paper introduces a novel probabilistic framework that unifies various recent proposals in long-tail learning.
We introduce a continuous contrastive learning method, CCL, extending our framework to unlabeled data using reliable and smoothed pseudo-labels.
arXiv Detail & Related papers (2024-10-08T15:06:10Z) - The Risk of Federated Learning to Skew Fine-Tuning Features and
Underperform Out-of-Distribution Robustness [50.52507648690234]
Federated learning has the risk of skewing fine-tuning features and compromising the robustness of the model.
We introduce three robustness indicators and conduct experiments across diverse robust datasets.
Our approach markedly enhances the robustness across diverse scenarios, encompassing various parameter-efficient fine-tuning methods.
arXiv Detail & Related papers (2024-01-25T09:18:51Z) - FedSym: Unleashing the Power of Entropy for Benchmarking the Algorithms
for Federated Learning [1.4656078321003647]
Federated learning (FL) is a decentralized machine learning approach where independent learners process data privately.
We study the currently popular data partitioning techniques and visualize their main disadvantages.
We propose a method that leverages entropy and symmetry to construct 'the most challenging' and controllable data distributions.
arXiv Detail & Related papers (2023-10-11T18:39:08Z) - Adaptive Negative Evidential Deep Learning for Open-set Semi-supervised Learning [69.81438976273866]
Open-set semi-supervised learning (Open-set SSL) considers a more practical scenario, where unlabeled data and test data contain new categories (outliers) not observed in the labeled data (inliers).
We introduce evidential deep learning (EDL) as an outlier detector to quantify different types of uncertainty, and design different uncertainty metrics for self-training and inference.
We propose a novel adaptive negative optimization strategy, making EDL more tailored to the unlabeled dataset containing both inliers and outliers.
arXiv Detail & Related papers (2023-03-21T09:07:15Z) - FedGen: Generalizable Federated Learning for Sequential Data [8.784435748969806]
In many real-world distributed settings, spurious correlations exist due to biases and data sampling issues.
We present a generalizable federated learning framework called FedGen, which allows clients to identify and distinguish between spurious and invariant features.
We show that FedGen yields models with significantly better generalization, outperforming current federated learning approaches in accuracy by over 24%.
arXiv Detail & Related papers (2022-11-03T15:48:14Z) - Depersonalized Federated Learning: Tackling Statistical Heterogeneity by
Alternating Stochastic Gradient Descent [6.394263208820851]
Federated learning (FL) enables devices to train a common machine learning (ML) model for intelligent inference without data sharing.
Raw data held by the various cooperating participants are in general non-identically distributed.
We propose a new FL scheme that tackles this statistical heterogeneity through depersonalized alternating stochastic gradient descent.
arXiv Detail & Related papers (2022-10-07T10:30:39Z) - Local Learning Matters: Rethinking Data Heterogeneity in Federated
Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z) - Regularizing Variational Autoencoder with Diversity and Uncertainty
Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z) - Distributionally Robust Learning with Stable Adversarial Training [34.74504615726101]
Machine learning algorithms with empirical risk minimization are vulnerable under distributional shifts.
We propose a novel Stable Adversarial Learning (SAL) algorithm that leverages heterogeneous data sources to construct a more practical uncertainty set.
arXiv Detail & Related papers (2021-06-30T03:05:45Z) - Learning while Respecting Privacy and Robustness to Distributional
Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z) - Heteroskedastic and Imbalanced Deep Learning with Adaptive
Regularization [55.278153228758434]
Real-world datasets are heteroskedastic and imbalanced.
Addressing heteroskedasticity and imbalance simultaneously is under-explored.
We propose a data-dependent regularization technique for heteroskedastic datasets.
arXiv Detail & Related papers (2020-06-29T01:09:50Z) - Unlabelled Data Improves Bayesian Uncertainty Calibration under
Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z) - Stable Adversarial Learning under Distributional Shifts [46.98655899839784]
Machine learning algorithms with empirical risk minimization are vulnerable under distributional shifts.
We propose Stable Adversarial Learning (SAL) algorithm that leverages heterogeneous data sources to construct a more practical uncertainty set.
arXiv Detail & Related papers (2020-06-08T08:42:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.