Diversity vs. Recognizability: Human-like generalization in one-shot
generative models
- URL: http://arxiv.org/abs/2205.10370v1
- Date: Fri, 20 May 2022 13:17:08 GMT
- Title: Diversity vs. Recognizability: Human-like generalization in one-shot
generative models
- Authors: Victor Boutin, Lakshya Singhal, Xavier Thomas and Thomas Serre
- Abstract summary: We propose a new framework to evaluate one-shot generative models along two axes: sample recognizability vs. diversity.
We first show that GAN-like and VAE-like models fall on opposite ends of the diversity-recognizability space.
Disentanglement, unlike other model parameters, transports the model along a parabolic curve that could be used to maximize recognizability.
- Score: 5.964436882344729
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robust generalization to new concepts has long remained a distinctive feature
of human intelligence. However, recent progress in deep generative models has
now led to neural architectures capable of synthesizing novel instances of
unknown visual concepts from a single training example. Yet, a more precise
comparison between these models and humans is not possible because existing
performance metrics for generative models (e.g., FID, IS, likelihood) are not
appropriate for the one-shot generation scenario. Here, we propose a new
framework to evaluate one-shot generative models along two axes: sample
recognizability vs. diversity (i.e., intra-class variability). Using this
framework, we perform a systematic evaluation of representative one-shot
generative models on the Omniglot handwritten dataset. We first show that
GAN-like and VAE-like models fall on opposite ends of the
diversity-recognizability space. Extensive analyses of the effect of key model
parameters further revealed that spatial attention and context integration have
a linear contribution to the diversity-recognizability trade-off. In contrast,
disentanglement transports the model along a parabolic curve that could be used
to maximize recognizability. Using the diversity-recognizability framework, we
were able to identify models and parameters that closely approximate human
data.
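The two axes described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that each generated sample has already been mapped to a feature vector by some encoder, that diversity is estimated as the mean per-class standard deviation of those features, and that recognizability is estimated as the accuracy of a nearest-prototype classifier. The function names `diversity` and `recognizability` and the array layout are hypothetical.

```python
import numpy as np


def diversity(features, labels):
    """Intra-class variability: per-class feature std, averaged over classes.

    Assumption: features has shape (n_samples, n_dims) and labels holds
    one integer class id per sample.
    """
    per_class = []
    for c in np.unique(labels):
        x = features[labels == c]
        # Std across samples for each feature dimension, averaged over dimensions.
        per_class.append(x.std(axis=0).mean())
    return float(np.mean(per_class))


def recognizability(features, labels, prototypes):
    """Fraction of samples whose nearest prototype belongs to their own class.

    Assumption: prototypes has shape (n_classes, n_dims), row k being the
    feature vector of the single exemplar shown for class k.
    """
    # Pairwise Euclidean distances, shape (n_samples, n_classes).
    dists = np.linalg.norm(features[:, None, :] - prototypes[None, :, :], axis=-1)
    predicted = dists.argmin(axis=1)
    return float((predicted == labels).mean())


# Toy usage with synthetic features standing in for encoder outputs.
rng = np.random.default_rng(0)
prototypes = np.eye(3) * 10.0              # three well-separated classes
labels = np.repeat(np.arange(3), 20)       # 20 generated samples per class
noise = rng.normal(size=(60, 3))
tight = prototypes[labels] + 0.1 * noise   # samples that hug the exemplar
wide = prototypes[labels] + 5.0 * noise    # samples that vary much more

for name, feats in [("tight", tight), ("wide", wide)]:
    print(name, diversity(feats, labels), recognizability(feats, labels, prototypes))
```

In this toy setting, samples that stay close to the exemplar score low on diversity and high on recognizability, while widely spread samples score higher on diversity. Plotting both values for each model places it in a diversity-recognizability space of the kind the abstract describes.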
Related papers
- Embedding-based statistical inference on generative models [10.948308354932639]
We extend results related to embedding-based representations of generative models to classical statistical inference settings.
We demonstrate that using the perspective space as the basis of a notion of "similar" is effective for multiple model-level inference tasks.
arXiv Detail & Related papers (2024-10-01T22:28:39Z)
- Bayesian Inverse Graphics for Few-Shot Concept Learning [3.475273727432576]
We present a Bayesian model of perception that learns using only minimal data.
We show how this representation can be used for downstream tasks such as few-shot classification and estimation.
arXiv Detail & Related papers (2024-09-12T18:30:41Z)
- Corpus Considerations for Annotator Modeling and Scaling [9.263562546969695]
We show that the commonly used user token model consistently outperforms more complex models.
Our findings shed light on the relationship between corpus statistics and annotator modeling performance.
arXiv Detail & Related papers (2024-04-02T22:27:24Z)
- Indeterminacy in Latent Variable Models: Characterization and Strong Identifiability [3.959606869996233]
We construct a theoretical framework for analyzing the indeterminacies of latent variable models.
We then investigate how we might specify strongly identifiable latent variable models.
arXiv Detail & Related papers (2022-06-02T00:01:27Z)
- STAR: Sparse Transformer-based Action Recognition [61.490243467748314]
This work proposes a novel skeleton-based human action recognition model with sparse attention on the spatial dimension and segmented linear attention on the temporal dimension of data.
Experiments show that our model achieves comparable performance while using far fewer trainable parameters, and trains and runs inference quickly.
arXiv Detail & Related papers (2021-07-15T02:53:11Z)
- How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models [95.8037674226622]
We introduce a 3-dimensional evaluation metric that characterizes the fidelity, diversity and generalization performance of any generative model in a domain-agnostic fashion.
Our metric unifies statistical divergence measures with precision-recall analysis, enabling sample- and distribution-level diagnoses of model fidelity and diversity.
arXiv Detail & Related papers (2021-02-17T18:25:30Z)
- Anomaly Detection of Time Series with Smoothness-Inducing Sequential Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series.
Our model parameterizes mean and variance for each time-stamp with flexible neural networks.
We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
arXiv Detail & Related papers (2021-02-02T06:15:15Z)
- Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way to achieve this is through artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting fire-weapons via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- AvgOut: A Simple Output-Probability Measure to Eliminate Dull Responses [97.50616524350123]
We build dialogue models that are dynamically aware of what utterances or tokens are dull without any feature-engineering.
The first model, MinAvgOut, directly maximizes the diversity score through the output distributions of each batch.
The second model, Label Fine-Tuning (LFT), prepends to the source sequence a label continuously scaled by the diversity score to control the diversity level.
The third model, RL, adopts Reinforcement Learning and treats the diversity score as a reward signal.
arXiv Detail & Related papers (2020-01-15T18:32:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.