On the Fragility of Contribution Score Computation in Federated Learning
- URL: http://arxiv.org/abs/2509.19921v2
- Date: Mon, 27 Oct 2025 06:25:33 GMT
- Title: On the Fragility of Contribution Score Computation in Federated Learning
- Authors: Balazs Pejo, Marcell Frank, Krisztian Varga, Peter Veliczky, Gergely Biczok
- Abstract summary: We argue that contribution scores are susceptible to significant distortions from two fundamental perspectives. First, we explore how different model aggregation methods impact these scores. Second, we explore vulnerabilities posed by poisoning attacks.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper investigates the fragility of contribution evaluation in federated learning, a critical mechanism for ensuring fairness and incentivizing participation. We argue that contribution scores are susceptible to significant distortions from two fundamental perspectives: architectural sensitivity and intentional manipulation. First, we explore how different model aggregation methods impact these scores. While most research assumes a basic averaging approach, we demonstrate that advanced techniques, including those designed to handle unreliable or diverse clients, can unintentionally yet significantly alter the final scores. Second, we explore vulnerabilities posed by poisoning attacks, where malicious participants strategically manipulate their model updates to inflate their own contribution scores or reduce the importance of other participants. Through extensive experiments across diverse datasets and model architectures, implemented within the Flower framework, we rigorously show that both the choice of aggregation method and the presence of attackers are potent vectors for distorting contribution scores, highlighting a critical need for more robust evaluation schemes.
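To make the two distortion vectors concrete, here is a minimal NumPy sketch, not the paper's Flower-based implementation, of leave-one-out contribution scoring, one common way such scores are computed. The toy utility function, the synthetic client updates, and the attacker's 5x scaling factor are illustrative assumptions of this sketch; only FedAvg-style averaging and the coordinate-wise median are standard aggregation rules.

```python
# Minimal sketch: how the choice of aggregation rule and a scaled malicious
# update can both shift leave-one-out contribution scores. All concrete
# values below are illustrative assumptions, not numbers from the paper.
import numpy as np

rng = np.random.default_rng(0)

def fed_avg(updates):
    # Plain (unweighted) averaging of client updates: the "basic averaging" baseline.
    return np.mean(updates, axis=0)

def coord_median(updates):
    # Coordinate-wise median: a simple robust aggregation rule.
    return np.median(updates, axis=0)

def utility(model, target):
    # Toy utility (an assumption): negative distance to a hypothetical ideal
    # update. In practice this would be, e.g., validation accuracy.
    return -np.linalg.norm(model - target)

def leave_one_out_scores(updates, aggregate, target):
    # Score client i by how much utility changes when client i is excluded.
    full = utility(aggregate(updates), target)
    return np.array([
        full - utility(aggregate(np.delete(updates, i, axis=0)), target)
        for i in range(len(updates))
    ])

# Three honest clients whose updates point roughly at the target, plus one
# attacker who inflates the magnitude of its update to dominate averaging.
target = np.ones(10)
honest = [target + 0.1 * rng.standard_normal(10) for _ in range(3)]
attacker = 5.0 * target  # inflated-magnitude update (illustrative attack)
updates = np.stack(honest + [attacker])

for name, agg in [("FedAvg", fed_avg), ("median", coord_median)]:
    print(name, np.round(leave_one_out_scores(updates, agg, target), 3))
```

Running this shows the same set of client updates yielding noticeably different score vectors under the two aggregation rules, with the scaled update shifting the attacker's own score far more under plain averaging than under the median, mirroring the paper's two claims.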
Related papers
- Adversarial Bias: Data Poisoning Attacks on Fairness [48.17618627431355]
There is relatively little research on how an AI system's fairness can be intentionally compromised. In this work, we provide a theoretical analysis demonstrating that a simple adversarial poisoning strategy is sufficient to induce maximally unfair behavior. Our attack significantly outperforms existing methods in degrading fairness metrics across multiple models and datasets.
arXiv Detail & Related papers (2025-11-11T15:09:53Z)
- Does simple trump complex? Comparing strategies for adversarial robustness in DNNs [3.6130723421895947]
Deep Neural Networks (DNNs) have shown substantial success in various applications but remain vulnerable to adversarial attacks. This study aims to identify and isolate the components of two different adversarial training techniques that contribute most to increased adversarial robustness.
arXiv Detail & Related papers (2025-08-25T13:33:38Z)
- Equally Critical: Samples, Targets, and Their Mappings in Datasets [6.859656302020063]
This paper investigates how samples and targets collectively influence training dynamics. We first establish a taxonomy of existing paradigms through the lens of sample-target interactions. We then propose a novel unified loss framework to assess their impact on training efficiency.
arXiv Detail & Related papers (2025-05-17T08:27:19Z)
- Enhancing Adversarial Robustness via Uncertainty-Aware Distributional Adversarial Training [43.766504246864045]
We propose a novel uncertainty-aware distributional adversarial training method.
Our approach achieves state-of-the-art adversarial robustness and maintains natural performance.
arXiv Detail & Related papers (2024-11-05T07:26:24Z)
- Towards Effective Evaluations and Comparisons for LLM Unlearning Methods [97.2995389188179]
This paper seeks to refine the evaluation of machine unlearning for large language models. It addresses two key challenges: the robustness of evaluation metrics and the trade-offs between competing goals.
arXiv Detail & Related papers (2024-06-13T14:41:00Z)
- Preference Poisoning Attacks on Reward Model Learning [47.00395978031771]
We investigate the nature and extent of a vulnerability in learning reward models from pairwise comparisons.
We propose two classes of algorithmic approaches for these attacks: a gradient-based framework, and several variants of rank-by-distance methods.
We find that the best attacks are often highly successful, achieving, in the most extreme case, a 100% success rate with only 0.3% of the data poisoned.
arXiv Detail & Related papers (2024-02-02T21:45:24Z)
- Resisting Adversarial Attacks in Deep Neural Networks using Diverse Decision Boundaries [12.312877365123267]
Deep learning systems are vulnerable to crafted adversarial examples, which may be imperceptible to the human eye, but can lead the model to misclassify.
We develop a new ensemble-based solution that constructs defender models with diverse decision boundaries with respect to the original model.
We present extensive experiments using standard image classification datasets, namely MNIST, CIFAR-10, and CIFAR-100, against state-of-the-art adversarial attacks.
arXiv Detail & Related papers (2022-08-18T08:19:26Z)
- Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z)
- Measuring Fairness Under Unawareness of Sensitive Attributes: A Quantification-Based Approach [131.20444904674494]
We tackle the problem of measuring group fairness under unawareness of sensitive attributes.
We show that quantification approaches are particularly suited to tackle the fairness-under-unawareness problem.
arXiv Detail & Related papers (2021-09-17T13:45:46Z)
- Estimating and Improving Fairness with Adversarial Learning [65.99330614802388]
We propose an adversarial multi-task training strategy to simultaneously mitigate and detect bias in the deep learning-based medical image analysis system.
Specifically, we propose to add a discrimination module against bias and a critical module that predicts unfairness within the base classification model.
We evaluate our framework on a large-scale, publicly available skin lesion dataset.
arXiv Detail & Related papers (2021-03-07T03:10:32Z)
- Accurate and Robust Feature Importance Estimation under Distribution Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z)
- Benchmarking Adversarial Robustness [47.168521143464545]
We establish a comprehensive, rigorous, and coherent benchmark to evaluate adversarial robustness on image classification tasks.
Based on the evaluation results, we draw several important findings and provide insights for future research.
arXiv Detail & Related papers (2019-12-26T12:37:01Z)