Towards GAN Benchmarks Which Require Generalization
- URL: http://arxiv.org/abs/2001.03653v1
- Date: Fri, 10 Jan 2020 20:18:47 GMT
- Title: Towards GAN Benchmarks Which Require Generalization
- Authors: Ishaan Gulrajani, Colin Raffel, Luke Metz
- Abstract summary: We argue that, for an evaluation metric not to be won by training set memorization, estimating the metric must require a large sample from the model.
We turn to neural network divergences (NNDs) which are defined in terms of a neural network trained to distinguish between distributions.
The resulting benchmarks cannot be "won" by training set memorization, while still being perceptually correlated and computable only from samples.
- Score: 48.075521136623564
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For many evaluation metrics commonly used as benchmarks for unconditional
image generation, trivially memorizing the training set attains a better score
than models which are considered state-of-the-art; we consider this
problematic. We clarify a necessary condition for an evaluation metric not to
behave this way: estimating the function must require a large sample from the
model. In search of such a metric, we turn to neural network divergences
(NNDs), which are defined in terms of a neural network trained to distinguish
between distributions. The resulting benchmarks cannot be "won" by training set
memorization, while still being perceptually correlated and computable only
from samples. We survey past work on using NNDs for evaluation and implement an
example black-box metric based on these ideas. Through experimental validation
we show that it can effectively measure diversity, sample quality, and
generalization.
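The idea of a divergence "defined in terms of a neural network trained to distinguish between distributions" can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration and not the paper's metric: the critic is a 1-D logistic regressor instead of a neural network, the score is held-out classification accuracy instead of the authors' divergence, and names such as `train_critic` and `critic_score` are hypothetical. It shows only the general pattern: fit a critic on one part of each sample, then score it on points it has not seen.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_critic(real, fake, steps=200, lr=0.5):
    """Fit a 1-D logistic critic (w, b) that labels real as 1 and fake as 0."""
    w, b = 0.0, 0.0
    data = [(x, 1.0) for x in real] + [(x, 0.0) for x in fake]
    n = len(data)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in data:
            err = sigmoid(w * x + b) - y  # gradient of the log loss w.r.t. the logit
            gw += err * x
            gb += err
        # Full-batch gradient descent step.
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b


def heldout_accuracy(critic, real, fake):
    """Accuracy of the critic on samples it was not trained on."""
    w, b = critic
    correct = sum(sigmoid(w * x + b) >= 0.5 for x in real)
    correct += sum(sigmoid(w * x + b) < 0.5 for x in fake)
    return correct / (len(real) + len(fake))


def critic_score(sample_real, sample_fake, n_train):
    """Train on the first n_train points of each sample, score on the rest."""
    critic = train_critic(sample_real[:n_train], sample_fake[:n_train])
    return heldout_accuracy(critic, sample_real[n_train:], sample_fake[n_train:])


rng = random.Random(0)
real = [rng.gauss(0.0, 1.0) for _ in range(2000)]
same = [rng.gauss(0.0, 1.0) for _ in range(2000)]     # matches the real distribution
shifted = [rng.gauss(2.0, 1.0) for _ in range(2000)]  # a clearly different distribution

score_same = critic_score(real, same, n_train=1000)
score_shifted = critic_score(real, shifted, n_train=1000)
print(score_same, score_shifted)
```

A score near 0.5 means the critic cannot tell the two samples apart on held-out points, while a score well above 0.5 means it can. This toy does not reproduce the memorization experiments described in the abstract.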
Related papers
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple
Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that requires no prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- Feature Likelihood Divergence: Evaluating the Generalization of
Generative Models Using Samples [25.657798631897908]
Feature Likelihood Divergence provides a comprehensive trichotomic evaluation of generative models.
We empirically demonstrate the ability of FLD to identify overfitting problem cases, even when previously proposed metrics fail.
arXiv Detail & Related papers (2023-02-09T04:57:27Z)
- A Statistical Model for Predicting Generalization in Few-Shot
Classification [6.158812834002346]
We introduce a Gaussian model of the feature distribution to predict the generalization error.
We show that our approach outperforms alternatives such as the leave-one-out cross-validation strategy.
arXiv Detail & Related papers (2022-12-13T10:21:15Z)
- Intersection of Parallels as an Early Stopping Criterion [64.8387564654474]
We propose a method to spot an early stopping point in the training iterations without the need for a validation set.
For a wide range of learning rates, our method, called Cosine-Distance Criterion (CDC), leads to better generalization on average than all the methods that we compare against.
arXiv Detail & Related papers (2022-08-19T19:42:41Z)
- A Study on the Evaluation of Generative Models [19.18642459565609]
Implicit generative models, which do not return likelihood values, have become prevalent in recent years.
In this work, we study the evaluation metrics of generative models by generating a high-quality synthetic dataset.
Our study shows that while FID and IS do correlate with several f-divergences, their ranking of close models can vary considerably.
arXiv Detail & Related papers (2022-06-22T09:27:31Z)
- Compare learning: bi-attention network for few-shot learning [6.559037166322981]
Metric learning, one family of few-shot learning methods, first learns a deep distance metric to determine whether a pair of images belongs to the same category.
In this paper, we propose a novel approach named Bi-attention network to compare the instances, which can measure the similarity between embeddings of instances precisely, globally and efficiently.
arXiv Detail & Related papers (2022-03-25T07:39:10Z)
- An Empirical Comparison of Instance Attribution Methods for NLP [62.63504976810927]
We evaluate the degree to which different potential instance attribution methods agree with respect to the importance of training samples.
We find that simple retrieval methods yield training instances that differ from those identified via gradient-based methods.
arXiv Detail & Related papers (2021-04-09T01:03:17Z)
- ATRM: Attention-based Task-level Relation Module for GNN-based Few-shot
Learning [14.464964336101028]
We propose a new relation measure method, namely the attention-based task-level relation module (ATRM).
The proposed module captures the relation representations between nodes by considering the sample-to-task instead of sample-to-sample embedding features.
Experimental results demonstrate that the proposed module is effective for GNN-based few-shot learning.
arXiv Detail & Related papers (2021-01-25T00:53:04Z)
- Pre-training Is (Almost) All You Need: An Application to Commonsense
Reasoning [61.32992639292889]
Fine-tuning of pre-trained transformer models has become the standard approach for solving common NLP tasks.
We introduce a new scoring method that casts a plausibility ranking task in a full-text format.
We show that our method provides a much more stable training phase across random restarts.
arXiv Detail & Related papers (2020-04-29T10:54:40Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.