A Survey of Federated Evaluation in Federated Learning
- URL: http://arxiv.org/abs/2305.08070v2
- Date: Fri, 19 May 2023 06:43:38 GMT
- Title: A Survey of Federated Evaluation in Federated Learning
- Authors: Behnaz Soltani, Yipeng Zhou, Venus Haghighi, John C.S. Lui
- Abstract summary: In traditional machine learning, it is trivial to conduct model evaluation since all data samples are managed centrally by a server. In federated learning, however, evaluation becomes challenging because clients do not expose their original data, in order to preserve data privacy. Federated evaluation plays a vital role in client selection, incentive mechanism design, malicious attack detection, etc.
- Score: 30.56651008584592
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In traditional machine learning, it is trivial to conduct model evaluation
since all data samples are managed centrally by a server. However, model
evaluation becomes a challenging problem in federated learning (FL), which is
called federated evaluation in this work. This is because clients do not expose
their original data to preserve data privacy. Federated evaluation plays a
vital role in client selection, incentive mechanism design, malicious attack
detection, etc. In this paper, we provide the first comprehensive survey of
existing federated evaluation methods. Moreover, we explore various
applications of federated evaluation for enhancing FL performance and finally
present future research directions by envisioning some challenges.
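Concretely, the pattern underlying most federated evaluation methods is to push evaluation to the clients and let the server aggregate only the resulting metrics, since raw samples never leave the devices. Below is a minimal sketch of that pattern; the function names and the sample-count weighting are illustrative assumptions, not taken from the survey.

```python
import numpy as np

def local_evaluate(model, features, labels):
    """Client side: score the shared model on private data.
    Only the metric and the sample count leave the device, never the data."""
    predictions = model(features).argmax(axis=1)
    return float((predictions == labels).mean()), len(labels)

def federated_evaluate(model, clients):
    """Server side: combine per-client metrics, weighted by local sample count."""
    results = [local_evaluate(model, x, y) for x, y in clients]
    total = sum(n for _, n in results)
    return sum(acc * n for acc, n in results) / total

# Toy usage: a random linear scorer and two synthetic clients.
rng = np.random.default_rng(0)
W = rng.normal(size=(5, 3))
clients = [(rng.normal(size=(40, 5)), rng.integers(0, 3, size=40)) for _ in range(2)]
print(federated_evaluate(lambda X: X @ W, clients))
```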
Related papers
- Existing Large Language Model Unlearning Evaluations Are Inconclusive [105.55899615056573]
We show that some evaluations introduce substantial new information into the model, potentially masking true unlearning performance. We demonstrate that evaluation outcomes vary significantly across tasks, undermining the generalizability of current evaluation routines. We propose two principles for future unlearning evaluations: minimal information injection and downstream task awareness.
arXiv Detail & Related papers (2025-05-31T19:43:00Z) - ATR-Bench: A Federated Learning Benchmark for Adaptation, Trust, and Reasoning [21.099779419619345]
We introduce a unified framework for analyzing federated learning through three foundational dimensions: Adaptation, Trust, and Reasoning. ATR-Bench lays the groundwork for a systematic and holistic evaluation of federated learning with real-world relevance.
arXiv Detail & Related papers (2025-05-22T16:11:38Z) - Federated Testing (FedTest): A New Scheme to Enhance Convergence and Mitigate Adversarial Attacks in Federated Learning [35.14491996649841]
We introduce a novel federated learning framework, which we call federated testing for federated learning (FedTest).
In FedTest, the local data of a specific user is used to train the model of that user and test the models of the other users.
Our numerical results reveal that the proposed method not only accelerates convergence rates but also diminishes the potential influence of malicious users.
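The cross-evaluation idea reads naturally as a score matrix whose entry (i, j) is the accuracy of client j's model on client i's data; a column that is poor across most rows points to a malicious or broken model. A rough sketch under that reading of the abstract (the flagging heuristic below is a guess for illustration, not the paper's rule):

```python
import numpy as np

def cross_evaluation(models, client_data):
    """scores[i, j] = accuracy of client j's model on client i's local data."""
    n = len(models)
    scores = np.zeros((n, n))
    for i, (x, y) in enumerate(client_data):
        for j, model in enumerate(models):
            scores[i, j] = (model(x).argmax(axis=1) == y).mean()
    return scores

def flag_suspicious(scores, factor=0.5):
    """Heuristic (assumed, not FedTest's rule): a model that scores poorly
    on most peers' data is flagged as a candidate malicious client."""
    n = scores.shape[0]
    peer_mean = (scores.sum(axis=0) - np.diag(scores)) / (n - 1)
    return np.where(peer_mean < factor * peer_mean.mean())[0]
```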
arXiv Detail & Related papers (2025-01-19T21:01:13Z) - FEDLAD: Federated Evaluation of Deep Leakage Attacks and Defenses [50.921333548391345]
Federated Learning is a privacy-preserving decentralized machine learning paradigm.
Recent research has revealed that private ground-truth data can be recovered from shared gradients through a technique known as Deep Leakage.
This paper introduces the FEDLAD Framework (Federated Evaluation of Deep Leakage Attacks and Defenses), a comprehensive benchmark for evaluating Deep Leakage attacks and defenses.
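For context, the attack family being benchmarked is easiest to grasp in code. Below is a compact sketch of the original gradient-matching recipe (Deep Leakage from Gradients, Zhu et al. 2019): optimize dummy inputs and labels until their gradients match the gradients a client shared. It illustrates the attack class FEDLAD evaluates, not FEDLAD's own implementation.

```python
import torch
import torch.nn.functional as F

def deep_leakage(model, observed_grads, x_shape, n_classes, steps=50):
    """Reconstruct a private sample by matching gradients (DLG sketch).
    `observed_grads` are the gradients the client shared; `x_shape` includes
    the batch dimension, e.g. (1, 3, 32, 32)."""
    dummy_x = torch.randn(x_shape, requires_grad=True)
    dummy_y = torch.randn(1, n_classes, requires_grad=True)
    opt = torch.optim.LBFGS([dummy_x, dummy_y])

    def closure():
        opt.zero_grad()
        # Soft-label cross entropy (probability targets need torch >= 1.10).
        loss = F.cross_entropy(model(dummy_x), dummy_y.softmax(dim=-1))
        grads = torch.autograd.grad(loss, model.parameters(), create_graph=True)
        gap = sum(((g - o) ** 2).sum() for g, o in zip(grads, observed_grads))
        gap.backward()
        return gap

    for _ in range(steps):
        opt.step(closure)
    return dummy_x.detach(), dummy_y.softmax(dim=-1).detach()
```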
arXiv Detail & Related papers (2024-11-05T11:42:26Z) - FedCert: Federated Accuracy Certification [8.34167718121698]
Federated Learning (FL) has emerged as a powerful paradigm for training machine learning models in a decentralized manner.
Previous studies have assessed the effectiveness of models in centralized training based on certified accuracy.
This study proposes a method named FedCert to take the first step toward evaluating the robustness of FL systems.
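The term "certified accuracy" refers to robustness certification in the randomized-smoothing sense (Cohen et al. 2019): a prediction counts only if the Gaussian-smoothed classifier is both correct and provably stable within some L2 radius. The sketch below shows that standard building block in simplified form; it is an assumption about the underlying metric, not FedCert's exact procedure.

```python
import numpy as np
from scipy.stats import norm

def certified_accuracy(model, X, y, n_classes, sigma=0.25, radius=0.1, n_mc=200):
    """Fraction of points the smoothed classifier predicts correctly AND
    certifies at L2 radius >= `radius`. The full procedure adds binomial
    confidence intervals, omitted here for brevity."""
    rng = np.random.default_rng(0)
    certified = 0
    for x, label in zip(X, y):
        noisy = x[None, :] + sigma * rng.standard_normal((n_mc, *x.shape))
        votes = np.bincount(model(noisy).argmax(axis=1), minlength=n_classes)
        top = votes.argmax()
        p_top = votes[top] / n_mc
        cert_radius = sigma * norm.ppf(p_top) if p_top > 0.5 else 0.0
        certified += (top == label) and (cert_radius >= radius)
    return certified / len(X)
```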
arXiv Detail & Related papers (2024-10-04T01:19:09Z) - F-Eval: Assessing Fundamental Abilities with Refined Evaluation Methods [102.98899881389211]
We propose F-Eval, a bilingual evaluation benchmark that assesses fundamental abilities, including expression, commonsense, and logic.
For reference-free subjective tasks, we devise new evaluation methods, serving as alternatives to scoring by API models.
arXiv Detail & Related papers (2024-01-26T13:55:32Z) - Data Valuation and Detections in Federated Learning [4.899818550820576]
Federated Learning (FL) enables collaborative model training while preserving the privacy of raw data.
A challenge in this framework is the fair and efficient valuation of data, which is crucial for incentivizing clients to contribute high-quality data in the FL task.
This paper introduces a novel privacy-preserving method for evaluating client contributions and selecting relevant datasets without a pre-specified training algorithm in an FL task.
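As a point of reference for what "valuation" means here, the classical non-private baseline is leave-one-out: a client's value is the drop in validation score when its data is excluded from training. The sketch below grounds that idea only; the paper's contribution is precisely to avoid both the privacy leakage and the fixed training algorithm this baseline requires.

```python
def leave_one_out_values(client_data, train_fn, eval_fn):
    """Baseline (non-private) data valuation: retrain without each client
    and measure the drop in validation score. `train_fn` maps a dict of
    client datasets to a model; `eval_fn` scores a model on held-out data."""
    full = eval_fn(train_fn(client_data))
    return {
        cid: full - eval_fn(train_fn({k: v for k, v in client_data.items()
                                      if k != cid}))
        for cid in client_data
    }
```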
arXiv Detail & Related papers (2023-11-09T12:01:32Z) - A Comprehensive Study on Model Initialization Techniques Ensuring Efficient Federated Learning [0.0]
Federated learning (FL) has emerged as a promising paradigm for training machine learning models in a distributed and privacy-preserving manner.
The choice of model initialization method plays a crucial role in the performance, convergence speed, communication efficiency, and privacy guarantees of federated learning systems.
Our research meticulously compares, categorizes, and delineates the merits and demerits of each technique, examining their applicability across diverse FL scenarios.
arXiv Detail & Related papers (2023-10-31T23:26:58Z) - Exploring Federated Unlearning: Analysis, Comparison, and Insights [101.64910079905566]
Federated unlearning enables the selective removal of data from models trained in federated systems.
This paper surveys existing federated unlearning approaches, examining their algorithmic efficiency, impact on model accuracy, and effectiveness in preserving privacy.
We propose the OpenFederatedUnlearning framework, a unified benchmark for evaluating federated unlearning methods.
arXiv Detail & Related papers (2023-10-30T01:34:33Z) - A Survey for Federated Learning Evaluations: Goals and Measures [26.120949005265345]
Federated learning (FL) is a novel paradigm for privacy-preserving machine learning.
Evaluating FL is challenging due to its interdisciplinary nature and diverse goals, such as utility, efficiency, and security.
We introduce FedEval, an open-source platform that provides a standardized and comprehensive evaluation framework for FL algorithms.
arXiv Detail & Related papers (2023-08-23T00:17:51Z) - A Call to Reflect on Evaluation Practices for Failure Detection in Image Classification [0.491574468325115]
We present a large-scale empirical study that, for the first time, enables benchmarking of confidence scoring functions.
The finding that a simple softmax response baseline is the overall best-performing method underlines the drastic shortcomings of current evaluation practice.
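That baseline, maximum softmax response, takes one line to compute: the confidence score of a prediction is simply the largest softmax probability, and predictions below a threshold are flagged as likely failures. A minimal sketch (the threshold value is an arbitrary example, not from the paper):

```python
import numpy as np

def softmax_response(logits):
    """Confidence score: the maximum softmax probability per sample; despite
    its simplicity, this baseline came out on top in the study's benchmark."""
    z = logits - logits.max(axis=1, keepdims=True)   # numerical stabilization
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return probs.max(axis=1)

def flag_failures(logits, tau=0.7):
    """Route low-confidence predictions (< tau) to review or fallback;
    tau would be tuned on held-out data in practice."""
    return softmax_response(logits) < tau
```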
arXiv Detail & Related papers (2022-11-28T12:25:27Z) - Towards Fair Federated Learning with Zero-Shot Data Augmentation [123.37082242750866]
Federated learning has emerged as an important distributed learning paradigm, where a server aggregates a global model from many client-trained models while having no access to the client data.
We propose a novel federated learning system that employs zero-shot data augmentation on under-represented data to mitigate statistical heterogeneity and encourage more uniform accuracy performance across clients in federated networks.
We study two variants of this scheme, Fed-ZDAC (federated learning with zero-shot data augmentation at the clients) and Fed-ZDAS (federated learning with zero-shot data augmentation at the server).
arXiv Detail & Related papers (2021-04-27T18:23:54Z) - Towards Automatic Evaluation of Dialog Systems: A Model-Free Off-Policy Evaluation Approach [84.02388020258141]
We propose a new framework named ENIGMA for estimating human evaluation scores based on off-policy evaluation in reinforcement learning.
ENIGMA requires only a small amount of pre-collected experience data, and therefore does not involve human interaction with the target policy during evaluation.
Our experiments show that ENIGMA significantly outperforms existing methods in terms of correlation with human evaluation scores.
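To ground what off-policy evaluation means here: the textbook approach reweights logged rewards by the target-to-behavior likelihood ratio, as in the self-normalized importance-sampling sketch below. This is shown only as the generic OPE idea; ENIGMA's model-free estimator is designed to avoid the high variance of such per-step ratios.

```python
import numpy as np

def snips_estimate(rewards, target_probs, behavior_probs):
    """Self-normalized importance sampling: estimate the target policy's
    expected (human) score from logged interactions alone. Each array has
    one entry per logged episode."""
    w = target_probs / behavior_probs          # likelihood ratios
    return float(np.sum(w * rewards) / np.sum(w))
```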
arXiv Detail & Related papers (2021-02-20T03:29:20Z) - Robustness Gym: Unifying the NLP Evaluation Landscape [91.80175115162218]
Deep neural networks are often brittle when deployed in real-world systems.
Recent research has focused on testing the robustness of such models.
We propose a solution in the form of Robustness Gym, a simple and extensible evaluation toolkit.
arXiv Detail & Related papers (2021-01-13T02:37:54Z) - Toward Understanding the Influence of Individual Clients in Federated Learning [52.07734799278535]
Federated learning allows clients to jointly train a global model without sending their private data to a central server.
We define a new notion called Influence, quantify this influence over model parameters, and propose an effective and efficient estimator for this metric.
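A crude way to picture this metric: measure how far the aggregated update moves when one client is excluded. The paper's estimator is more refined and far cheaper than recomputation, so the leave-one-out sketch below is only an illustrative proxy.

```python
import numpy as np

def influence_proxy(client_updates):
    """Leave-one-out proxy for a client's influence on the aggregate:
    the distance the averaged update moves when that client is excluded.
    (Illustrative only; the paper estimates its Influence metric directly.)"""
    all_mean = np.mean(list(client_updates.values()), axis=0)
    return {
        cid: float(np.linalg.norm(all_mean - np.mean(
            [u for k, u in client_updates.items() if k != cid], axis=0)))
        for cid in client_updates
    }
```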
arXiv Detail & Related papers (2020-12-20T14:34:36Z)