On the Evaluation of Generative Adversarial Networks By Discriminative Models
- URL: http://arxiv.org/abs/2010.03549v1
- Date: Wed, 7 Oct 2020 17:50:39 GMT
- Title: On the Evaluation of Generative Adversarial Networks By Discriminative Models
- Authors: Amirsina Torfi, Mohammadreza Beyki, Edward A. Fox
- Abstract summary: Generative Adversarial Networks (GANs) can accurately model complex multi-dimensional data and generate realistic samples.
The majority of research efforts associated with tackling this issue were validated by qualitative visual evaluation.
In this work, we leverage Siamese neural networks to propose a domain-agnostic evaluation metric.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative Adversarial Networks (GANs) can accurately model complex
multi-dimensional data and generate realistic samples. However, due to their
implicit estimation of data distributions, their evaluation is a challenging
task. The majority of research efforts associated with tackling this issue were
validated by qualitative visual evaluation. Such approaches do not generalize
well beyond the image domain. Since many of those evaluation metrics are
proposed and bound to the vision domain, they are difficult to apply to other
domains. Quantitative measures are necessary to better guide the training and
comparison of different GAN models. In this work, we leverage Siamese neural
networks to propose a domain-agnostic evaluation metric that: (1) is consistent
with human qualitative evaluation, (2) is robust to common GAN issues such as
mode dropping and mode invention, and (3) does not require any pretrained
classifier. The empirical results in this paper demonstrate that this method
outperforms the popular Inception Score and is competitive with the FID score.
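The core idea of the metric, comparing real and generated samples through a shared-weight (Siamese) embedding, can be sketched in a few lines. The paper's actual architecture and training procedure are not given here, so everything below is an illustrative assumption: a fixed random encoder stands in for the learned Siamese branch, and a nearest-neighbor embedding distance stands in for the learned similarity.

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(x, w):
    """Shared-weight embedding; both branches of the Siamese pair use the
    same w (random and fixed here; the paper would learn it)."""
    return np.tanh(0.1 * x @ w)

def siamese_score(real, fake, w):
    """Mean distance from each generated sample's embedding to its nearest
    real embedding; lower means the generator matches the data better."""
    er, ef = encoder(real, w), encoder(fake, w)
    d = np.linalg.norm(ef[:, None, :] - er[None, :, :], axis=-1)
    return float(d.min(axis=1).mean())

w = rng.normal(size=(8, 4))               # embedding weights (untrained stand-in)
real = rng.normal(size=(64, 8))           # "data" samples
good = rng.normal(size=(64, 8))           # fakes from the data distribution
bad = rng.normal(loc=10.0, size=(64, 8))  # fakes from a badly shifted distribution
assert siamese_score(real, good, w) < siamese_score(real, bad, w)
```

Because the score is computed purely from sample embeddings, nothing in it is specific to images, which is the sense in which such a metric can be domain-agnostic.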
Related papers
- Evaluating Deep Neural Networks in Deployment (A Comparative and Replicability Study) [11.242083685224554]
Deep neural networks (DNNs) are increasingly used in safety-critical applications.
We study recent approaches that have been proposed to evaluate the reliability of DNNs in deployment.
We find that it is hard to run and reproduce the results for these approaches on their replication packages and even more difficult to run them on artifacts other than their own.
arXiv Detail & Related papers (2024-07-11T17:58:12Z)
- Towards Evaluating Transfer-based Attacks Systematically, Practically, and Fairly [79.07074710460012]
The adversarial vulnerability of deep neural networks (DNNs) has drawn great attention.
An increasing number of transfer-based methods have been developed to fool black-box DNN models.
We establish a transfer-based attack benchmark (TA-Bench) which implements 30+ methods.
arXiv Detail & Related papers (2023-11-02T15:35:58Z)
- Activate and Reject: Towards Safe Domain Generalization under Category Shift [71.95548187205736]
We study a practical problem of Domain Generalization under Category Shift (DGCS).
It aims to simultaneously detect unknown-class samples and classify known-class samples in the target domains.
Compared to prior DG works, we face two new challenges: 1) how to learn the concept of "unknown" during training with only source known-class samples, and 2) how to adapt the source-trained model to unseen environments.
arXiv Detail & Related papers (2023-10-07T07:53:12Z)
- GMValuator: Similarity-based Data Valuation for Generative Models [41.76259565672285]
We introduce Generative Model Valuator (GMValuator), the first training-free and model-agnostic approach to provide data valuation for generation tasks.
GMValuator is extensively evaluated on various datasets and generative architectures to demonstrate its effectiveness.
arXiv Detail & Related papers (2023-04-21T02:02:02Z)
- GREAT Score: Global Robustness Evaluation of Adversarial Perturbation using Generative Models [60.48306899271866]
We present a new framework, called GREAT Score, for global robustness evaluation of adversarial perturbation using generative models.
We show high correlation and significantly reduced cost of GREAT Score when compared to the attack-based model ranking on RobustBench.
GREAT Score can be used for remote auditing of privacy-sensitive black-box models.
arXiv Detail & Related papers (2023-04-19T14:58:27Z)
- Revisiting the Evaluation of Image Synthesis with GANs [55.72247435112475]
This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models.
In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set.
arXiv Detail & Related papers (2023-04-04T17:54:32Z)
- Generalizability of Adversarial Robustness Under Distribution Shifts [57.767152566761304]
We take a first step towards investigating the interplay between empirical and certified adversarial robustness on one hand and domain generalization on another.
We train robust models on multiple domains and evaluate their accuracy and robustness on an unseen domain.
We extend our study to cover a real-world medical application, in which adversarial augmentation significantly boosts the generalization of robustness with minimal effect on clean data accuracy.
arXiv Detail & Related papers (2022-09-29T18:25:48Z)
- Towards GAN Benchmarks Which Require Generalization [48.075521136623564]
We argue that estimating the function must require a large sample from the model.
We turn to neural network divergences (NNDs) which are defined in terms of a neural network trained to distinguish between distributions.
The resulting benchmarks cannot be "won" by training set memorization, while still being perceptually correlated and computable only from samples.
arXiv Detail & Related papers (2020-01-10T20:18:47Z)
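The neural network divergence (NND) idea in the entry above, a benchmark defined by how well a discriminator trained from samples can tell real data from model output, can be illustrated with a toy sketch. The logistic discriminator, hyperparameters, and accuracy-based score below are illustrative assumptions, not the cited paper's actual benchmark.

```python
import numpy as np

rng = np.random.default_rng(1)

def nnd(real, fake, steps=200, lr=0.5):
    """Toy neural-network divergence: train a logistic discriminator to
    separate real from fake samples and return its training accuracy.
    ~0.5 means near-indistinguishable distributions; 1.0 means trivially
    separable (a poor generator)."""
    X = np.vstack([real, fake])
    y = np.concatenate([np.ones(len(real)), np.zeros(len(fake))])
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid probabilities
        g = p - y                               # gradient of the log-loss
        w -= lr * X.T @ g / len(y)
        b -= lr * g.mean()
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    return float(((p > 0.5) == y).mean())

real = rng.normal(size=(200, 5))
close = rng.normal(size=(200, 5))           # samples from the data distribution
far = rng.normal(loc=3.0, size=(200, 5))    # samples that are easy to tell apart
assert nnd(real, far) > nnd(real, close)
```

Because the discriminator is trained from samples alone, a model cannot "win" such a benchmark by memorizing the training set: memorized samples are still distinguishable from fresh draws of the data distribution.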
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.