Lightweight and Robust Federated Data Valuation
- URL: http://arxiv.org/abs/2509.25560v1
- Date: Mon, 29 Sep 2025 22:31:56 GMT
- Title: Lightweight and Robust Federated Data Valuation
- Authors: Guojun Tang, Jiayu Zhou, Mohammad Mamun, Steve Drew
- Abstract summary: Federated learning (FL) faces persistent challenges due to non-IID data distributions and adversarial client behavior. We propose FedIF, a novel FL aggregation framework that leverages trajectory-based influence estimation to efficiently compute client contributions. Our results establish FedIF as a practical, theoretically grounded, and scalable alternative to Shapley-value-based approaches for efficient and robust FL in real-world deployments.
- Score: 24.16107496848504
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated learning (FL) faces persistent robustness challenges due to non-IID data distributions and adversarial client behavior. A promising mitigation strategy is contribution evaluation, which enables adaptive aggregation by quantifying each client's utility to the global model. However, state-of-the-art Shapley-value-based approaches incur high computational overhead due to repeated model reweighting and inference, which limits their scalability. We propose FedIF, a novel FL aggregation framework that leverages trajectory-based influence estimation to efficiently compute client contributions. FedIF adapts decentralized FL by introducing normalized and smoothed influence scores computed from lightweight gradient operations on client updates and a public validation set. Theoretical analysis demonstrates that FedIF yields a tighter bound on one-step global loss change under noisy conditions. Extensive experiments on CIFAR-10 and Fashion-MNIST show that FedIF achieves robustness comparable to or exceeding SV-based methods in the presence of label noise, gradient noise, and adversarial samples, while reducing aggregation overhead by up to 450x. Ablation studies confirm the effectiveness of FedIF's design choices, including local weight normalization and influence smoothing. Our results establish FedIF as a practical, theoretically grounded, and scalable alternative to Shapley-value-based approaches for efficient and robust FL in real-world deployments.
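The aggregation idea described in the abstract (influence scores from client updates and a public validation set, then normalized and smoothed before weighting) can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not give formulas, so the score below is a simple first-order proxy (the negated dot product of the validation-loss gradient with a client's update), and the clipping rule, the exponential smoothing factor `beta`, and all function names are assumptions made for illustration.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def influence_scores(val_grad: Vector, updates: List[Vector]) -> List[float]:
    # Assumed first-order proxy: applying update `delta` changes the
    # validation loss by about dot(val_grad, delta), so the negated value
    # is positive when the update is expected to reduce the loss.
    return [-dot(val_grad, delta) for delta in updates]


def normalize(scores: List[float]) -> List[float]:
    # Assumed rule: clip harmful (negative) scores to zero and rescale to
    # sum to 1; fall back to uniform weights if nothing is positive.
    clipped = [max(s, 0.0) for s in scores]
    total = sum(clipped)
    if total == 0.0:
        return [1.0 / len(scores)] * len(scores)
    return [c / total for c in clipped]


def smooth(previous: List[float], current: List[float], beta: float = 0.5) -> List[float]:
    # Exponential moving average across rounds (beta is a hypothetical knob).
    return [beta * p + (1.0 - beta) * c for p, c in zip(previous, current)]


def aggregate(global_model: Vector, updates: List[Vector], weights: List[float]) -> Vector:
    # Weighted sum of client updates added to the current global parameters.
    return [
        base + sum(w * delta[i] for w, delta in zip(weights, updates))
        for i, base in enumerate(global_model)
    ]


# Toy round: three clients, two parameters, gradient from a public validation set.
val_grad = [1.0, 0.0]
updates = [[-1.0, 0.0], [1.0, 0.0], [-3.0, 0.0]]
round_weights = normalize(influence_scores(val_grad, updates))
weights = smooth([1.0 / 3.0] * 3, round_weights)
print(weights)
print(aggregate([0.0, 0.0], updates, weights))
```

In this toy round the second client's update points along the validation gradient, so its per-round weight is clipped to zero and only the smoothing term keeps it from being excluded outright. Each round costs one dot product per client, in contrast to the repeated model reweighting and inference that the abstract attributes to Shapley-value-based methods.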
Related papers
- Adaptive Dual-Weighting Framework for Federated Learning via Out-of-Distribution Detection [53.45696787935487]
Federated Learning (FL) enables collaborative model training across large-scale distributed service nodes.
In real-world service-oriented deployments, data generated by heterogeneous users, devices, and application scenarios are inherently non-IID.
We propose FLood, a novel FL framework inspired by out-of-distribution (OOD) detection.
arXiv Detail & Related papers (2026-02-01T05:54:59Z)
- CO-PFL: Contribution-Oriented Personalized Federated Learning for Heterogeneous Networks [51.43780477302533]
Contribution-Oriented PFL (CO-PFL) is a novel algorithm that dynamically estimates each client's contribution for global aggregation.
CO-PFL consistently surpasses state-of-the-art methods in personalization accuracy, robustness, scalability, and convergence stability.
arXiv Detail & Related papers (2025-10-23T05:10:06Z)
- FedReFT: Federated Representation Fine-Tuning with All-But-Me Aggregation [12.544628972135905]
We introduce Federated Representation Fine-Tuning (FedReFT), a novel approach to fine-tune the client's hidden representation.
FedReFT applies sparse intervention layers to steer hidden representations directly, offering a lightweight and semantically rich fine-tuning alternative.
We evaluate FedReFT on commonsense reasoning, arithmetic reasoning, instruction-tuning, and GLUE.
arXiv Detail & Related papers (2025-08-27T22:03:19Z)
- FlexFed: Mitigating Catastrophic Forgetting in Heterogeneous Federated Learning in Pervasive Computing Environments [4.358456799125694]
Pervasive computing environments (e.g., for Human Activity Recognition, HAR) are characterized by resource-constrained end devices, streaming sensor data and intermittent client participation.
We propose FlexFed, a novel FL approach that prioritizes data retention for efficient memory use and dynamically adjusts offline training frequency.
We also develop a realistic HAR-based evaluation framework that simulates streaming data, dynamic distributions, imbalances and varying availability.
arXiv Detail & Related papers (2025-05-19T14:23:37Z)
- FedEFC: Federated Learning Using Enhanced Forward Correction Against Noisy Labels [2.8547732086436306]
Federated Learning (FL) is a powerful framework for privacy-preserving distributed learning.
Handling noisy labels in FL remains a major challenge due to heterogeneous data distributions and communication constraints.
We propose FedEFC, a novel method designed to tackle the impact of noisy labels in FL.
arXiv Detail & Related papers (2025-04-08T02:14:50Z)
- FedPCA: Noise-Robust Fair Federated Learning via Performance-Capacity Analysis [39.424995330773264]
FedPCA identifies mislabeled clients via a Gaussian Mixture Model on loss-dispersion pairs.
It applies fairness and robustness strategies in global aggregation and local training by adjusting client weights and selectively using reliable data.
arXiv Detail & Related papers (2025-03-13T17:18:18Z)
- Feasible Learning [78.6167929413604]
We introduce Feasible Learning (FL), a sample-centric learning paradigm where models are trained by solving a feasibility problem that bounds the loss for each training sample.
Our empirical analysis, spanning image classification, age regression, and preference optimization in large language models, demonstrates that models trained via FL can learn from data while displaying improved tail behavior compared to ERM, with only a marginal impact on average performance.
arXiv Detail & Related papers (2025-01-24T20:39:38Z)
- Over-the-Air Fair Federated Learning via Multi-Objective Optimization [52.295563400314094]
We propose an over-the-air fair federated learning algorithm (OTA-FFL) to train fair FL models.
Experiments demonstrate the superiority of OTA-FFL in achieving fairness and robust performance.
arXiv Detail & Related papers (2025-01-06T21:16:51Z)
- An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round.
We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z)
- Personalized Federated Learning under Mixture of Distributions [98.25444470990107]
We propose a novel approach to Personalized Federated Learning (PFL), which utilizes Gaussian mixture models (GMM) to fit the input data distributions across diverse clients.
FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification.
Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.
arXiv Detail & Related papers (2023-05-01T20:04:46Z)
- Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning [86.59588262014456]
Federated Learning (FL) is an emerging distributed learning paradigm under privacy constraint.
We propose a data-free knowledge distillation method to fine-tune the global model in the server (FedFTG).
Our FedFTG significantly outperforms the state-of-the-art (SOTA) FL algorithms and can serve as a strong plugin for enhancing FedAvg, FedProx, FedDyn, and SCAFFOLD.
arXiv Detail & Related papers (2022-03-17T11:18:17Z)
- FedPrune: Towards Inclusive Federated Learning [1.308951527147782]
Federated learning (FL) is a distributed learning technique that trains a shared model over distributed data in a privacy-preserving manner.
We propose FedPrune, a system that tackles this challenge by pruning the global model for slow clients based on their device characteristics.
By using insights from the Central Limit Theorem, FedPrune incorporates a new aggregation technique that achieves robust performance over non-IID data.
arXiv Detail & Related papers (2021-10-27T06:33:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.