On the Distributed Evaluation of Generative Models
- URL: http://arxiv.org/abs/2310.11714v4
- Date: Tue, 11 Jun 2024 07:33:04 GMT
- Title: On the Distributed Evaluation of Generative Models
- Authors: Zixiao Wang, Farzan Farnia, Zhenghao Lin, Yunheng Shen, Bei Yu
- Abstract summary: We focus on the widely-used distance-based evaluation metrics, Fréchet Inception Distance (FID) and Kernel Inception Distance (KID).
In the case of KID metric, we prove that scoring a group of generative models using the clients' averaged KID score will result in the same ranking as that of a centralized KID evaluation over a collective reference set containing all the clients' data.
We provide examples in which two generative models are assigned the same FID score by each client in a distributed setting, while the centralized FID scores of the two models are significantly different.
- Score: 15.629121946912088
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The evaluation of deep generative models has been extensively studied in the centralized setting, where the reference data are drawn from a single probability distribution. On the other hand, several applications of generative models concern distributed settings, e.g. the federated learning setting, where the reference data for conducting evaluation are provided by several clients in a network. In this paper, we study the evaluation of generative models in such distributed contexts with potentially heterogeneous data distributions across clients. We focus on the widely-used distance-based evaluation metrics, Fréchet Inception Distance (FID) and Kernel Inception Distance (KID). In the case of KID metric, we prove that scoring a group of generative models using the clients' averaged KID score will result in the same ranking as that of a centralized KID evaluation over a collective reference set containing all the clients' data. In contrast, we show the same result does not apply to the FID-based evaluation. We provide examples in which two generative models are assigned the same FID score by each client in a distributed setting, while the centralized FID scores of the two models are significantly different. We perform several numerical experiments on standard image datasets and generative models to support our theoretical results on the distributed evaluation of generative models using FID and KID scores.
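The two claims in the abstract can be illustrated with a small numerical sketch. This is not the paper's experiment. It assumes synthetic Gaussian vectors in place of Inception features, equally sized clients, the biased (V-statistic) estimator of squared MMD with the cubic polynomial kernel commonly used for KID, and a hand-built 1-D Gaussian toy case for FID. All names (`poly_kernel`, `mmd2`, `gaussian_fid_1d`, the client and model parameters) are hypothetical.

```python
import numpy as np

def poly_kernel(a, b):
    """Cubic polynomial kernel (a.b / d + 1)^3, the usual choice for KID."""
    d = a.shape[1]
    return (a @ b.T / d + 1.0) ** 3

def mmd2(x, y):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    return (poly_kernel(x, x).mean() + poly_kernel(y, y).mean()
            - 2.0 * poly_kernel(x, y).mean())

def gaussian_fid_1d(m1, s1, m2, s2):
    """Frechet distance between 1-D Gaussians N(m1, s1^2) and N(m2, s2^2)."""
    return (m1 - m2) ** 2 + (s1 - s2) ** 2

# --- KID-style scores: client average vs. centralized (assumed synthetic setup) ---
rng = np.random.default_rng(0)
dim, n = 4, 200
# Heterogeneous clients of equal size: same shape, different means.
clients = [rng.normal(loc=shift, size=(n, dim)) for shift in (-1.0, 0.0, 2.0)]
pooled = np.concatenate(clients)
# Stand-ins for the feature samples of three generative models.
models = [rng.normal(loc=mu, scale=sc, size=(n, dim))
          for mu, sc in ((0.0, 1.0), (0.5, 2.0), (3.0, 0.5))]

avg_kid = np.array([np.mean([mmd2(g, c) for c in clients]) for g in models])
central_kid = np.array([mmd2(g, pooled) for g in models])
# The gap depends only on the clients' data, so it is the same for every model
# and the two scores induce the same ranking.
kid_gap = avg_kid - central_kid
print("client-averaged:", avg_kid)
print("centralized:    ", central_kid)
print("gap per model:  ", kid_gap)

# --- FID: equal per-client scores, different centralized scores (toy 1-D case) ---
client_params = [(-1.0, 1.0), (1.0, 1.0)]      # (mean, std) of each client
means = np.array([m for m, _ in client_params])
stds = np.array([s for _, s in client_params])
pooled_mean = means.mean()
# Variance of the equal-weight mixture of the two clients.
pooled_std = np.sqrt(np.mean(stds ** 2 + means ** 2) - pooled_mean ** 2)

model_a, model_b = (0.0, 1.5), (0.0, 0.5)      # (mean, std) of each model
fid_a_clients = [gaussian_fid_1d(*model_a, m, s) for m, s in client_params]
fid_b_clients = [gaussian_fid_1d(*model_b, m, s) for m, s in client_params]
fid_a_central = gaussian_fid_1d(*model_a, pooled_mean, pooled_std)
fid_b_central = gaussian_fid_1d(*model_b, pooled_mean, pooled_std)
print("per-client FID, model A / B:", fid_a_clients, fid_b_clients)
print("centralized FID, model A / B:", fid_a_central, fid_b_central)
```

In this sketch the KID gap is constant because the squared MMD is a squared distance between mean kernel embeddings, so averaging over clients only adds a term that does not involve the model. FID fits a Gaussian to the pooled reference data, whose variance includes the spread between client means, which is why the per-client ties in the toy case break in the centralized score.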
Related papers
- An Optimism-based Approach to Online Evaluation of Generative Models [23.91197677628145]
We propose an online evaluation framework to find the generative model that maximizes a standard assessment score among a group of available models.
Specifically, we study the online assessment of generative models based on the Fréchet Inception Distance (FID) and Inception Score (IS) metrics.
arXiv Detail & Related papers (2024-06-11T16:57:48Z)
- GREAT Score: Global Robustness Evaluation of Adversarial Perturbation using Generative Models [60.48306899271866]
We present a new framework, called GREAT Score, for global robustness evaluation of adversarial perturbation using generative models.
We show high correlation and significantly reduced cost of GREAT Score when compared to the attack-based model ranking on RobustBench.
GREAT Score can be used for remote auditing of privacy-sensitive black-box models.
arXiv Detail & Related papers (2023-04-19T14:58:27Z)
- Revisiting the Evaluation of Image Synthesis with GANs [55.72247435112475]
This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models.
In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set.
arXiv Detail & Related papers (2023-04-04T17:54:32Z)
- Federated Learning Aggregation: New Robust Algorithms with Guarantees [63.96013144017572]
Federated learning has been recently proposed for distributed model training at the edge.
This paper presents a complete general mathematical convergence analysis to evaluate aggregation strategies in a federated learning framework.
We derive novel aggregation algorithms which are able to modify their model architecture by differentiating client contributions according to the value of their losses.
arXiv Detail & Related papers (2022-05-22T16:37:53Z)
- Statistical Model Criticism of Variational Auto-Encoders [15.005894753472894]
We propose a framework for the statistical evaluation of variational auto-encoders (VAEs).
We test two instances of this framework in the context of modelling images of handwritten digits and a corpus of English text.
arXiv Detail & Related papers (2022-04-06T18:19:29Z)
- A Personalized Federated Learning Algorithm: an Application in Anomaly Detection [0.6700873164609007]
Federated Learning (FL) has recently emerged as a promising method to overcome data privacy and transmission issues.
In FL, datasets collected from different devices or sensors are used to train local models (clients), each of which shares its learning with a centralized model (server).
This paper proposes a novel Personalized FedAvg (PC-FedAvg) which aims to control weights communication and aggregation augmented with a tailored learning algorithm to personalize the resulting models at each client.
arXiv Detail & Related papers (2021-11-04T04:57:11Z)
- Implicit Model Specialization through DAG-based Decentralized Federated Learning [0.0]
Federated learning allows a group of distributed clients to train a common machine learning model on private data.
We propose a unified approach to decentralization and personalization in federated learning.
Our evaluation shows that the specialization of models emerges directly from the DAG-based communication of model updates.
arXiv Detail & Related papers (2021-11-01T20:55:47Z)
- Decentralised Person Re-Identification with Selective Knowledge Aggregation [56.40855978874077]
Existing person re-identification (Re-ID) methods mostly follow a centralised learning paradigm which shares all training data to a collection for model learning.
Two recent works have introduced decentralised (federated) Re-ID learning for constructing a globally generalised model (server).
However, these methods are poor on how to adapt the generalised model to maximise its performance on individual client domain Re-ID tasks.
We present a new Selective Knowledge Aggregation approach to decentralised person Re-ID to optimise the trade-off between model personalisation and generalisation.
arXiv Detail & Related papers (2021-10-21T18:09:53Z)
- How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models [95.8037674226622]
We introduce a 3-dimensional evaluation metric that characterizes the fidelity, diversity and generalization performance of any generative model in a domain-agnostic fashion.
Our metric unifies statistical divergence measures with precision-recall analysis, enabling sample- and distribution-level diagnoses of model fidelity and diversity.
arXiv Detail & Related papers (2021-02-17T18:25:30Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.