Towards Mode Balancing of Generative Models via Diversity Weights
- URL: http://arxiv.org/abs/2304.11961v3
- Date: Thu, 15 Jun 2023 09:21:29 GMT
- Title: Towards Mode Balancing of Generative Models via Diversity Weights
- Authors: Sebastian Berns, Simon Colton, Christian Guckelsberger
- Abstract summary: We present diversity weights, a training scheme that increases a model's output diversity by balancing the modes in the training dataset.
We discuss connections of our approach to diversity, equity, and inclusion in generative machine learning more generally, and computational creativity specifically.
- Score: 1.2354076490479513
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large data-driven image models are extensively used to support creative and
artistic work. Under the currently predominant distribution-fitting paradigm, a
dataset is treated as ground truth to be approximated as closely as possible.
Yet, many creative applications demand a diverse range of output, and creators
often strive to actively diverge from a given data distribution. We argue that
an adjustment of modelling objectives, from pure mode coverage towards mode
balancing, is necessary to accommodate the goal of higher output diversity. We
present diversity weights, a training scheme that increases a model's output
diversity by balancing the modes in the training dataset. First experiments in
a controlled setting demonstrate the potential of our method. We discuss
connections of our approach to diversity, equity, and inclusion in generative
machine learning more generally, and computational creativity specifically. An
implementation of our algorithm is available at
https://github.com/sebastianberns/diversity-weights
Related papers
- Learning Multimodal Latent Generative Models with Energy-Based Prior [3.6648642834198797]
We propose a novel framework that integrates the latent generative model with the EBM.
This approach results in a more expressive and informative prior, better-capturing of information across multiple modalities.
arXiv Detail & Related papers (2024-09-30T01:38:26Z) - Diffusion Models For Multi-Modal Generative Modeling [32.61765315067488]
We propose a principled way to define a diffusion model by constructing a unified multi-modal diffusion model in a common diffusion space.
We propose several multimodal generation settings to verify our framework, including image transition, masked-image training, joint image-label and joint image-representation generative modeling.
arXiv Detail & Related papers (2024-07-24T18:04:17Z) - Data-Juicer Sandbox: A Comprehensive Suite for Multimodal Data-Model Co-development [67.55944651679864]
We present a novel sandbox suite tailored for integrated data-model co-development.
This sandbox provides a comprehensive experimental platform, enabling rapid iteration and insight-driven refinement of both data and models.
We also uncover fruitful insights gleaned from exhaustive benchmarks, shedding light on the critical interplay between data quality, diversity, and model behavior.
arXiv Detail & Related papers (2024-07-16T14:40:07Z) - StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized
Image-Dialogue Data [129.92449761766025]
We propose a novel data collection methodology that synchronously synthesizes images and dialogues for visual instruction tuning.
This approach harnesses the power of generative models, marrying the abilities of ChatGPT and text-to-image generative models.
Our research includes comprehensive experiments conducted on various datasets.
arXiv Detail & Related papers (2023-08-20T12:43:52Z) - Explore and Exploit the Diverse Knowledge in Model Zoo for Domain
Generalization [40.28810906825559]
We propose a new algorithm for integrating diverse pretrained models, not limited to the strongest models, in order to achieve enhanced out-of-distribution generalization performance.
Our proposed method demonstrates state-of-the-art empirical results on a variety of datasets, thus validating the benefits of utilizing diverse knowledge.
arXiv Detail & Related papers (2023-06-05T04:58:41Z) - Learning Sequential Latent Variable Models from Multimodal Time Series
Data [6.107812768939553]
We present a self-supervised generative modelling framework to jointly learn a probabilistic latent state representation of multimodal data.
We demonstrate that our approach leads to significant improvements in prediction and representation quality.
arXiv Detail & Related papers (2022-04-21T21:59:24Z) - Multimodal Adversarially Learned Inference with Factorized
Discriminators [10.818838437018682]
We propose a novel approach to generative modeling of multimodal data based on generative adversarial networks.
To learn a coherent multimodal generative model, we show that it is necessary to align different encoder distributions with the joint decoder distribution simultaneously.
By taking advantage of contrastive learning through factorizing the discriminator, we train our model on unimodal data.
arXiv Detail & Related papers (2021-12-20T08:18:49Z) - Trajectory-wise Multiple Choice Learning for Dynamics Generalization in
Reinforcement Learning [137.39196753245105]
We present a new model-based reinforcement learning algorithm that learns a multi-headed dynamics model for dynamics generalization.
We incorporate context learning, which encodes dynamics-specific information from past experiences into the context latent vector.
Our method exhibits superior zero-shot generalization performance across a variety of control tasks, compared to state-of-the-art RL methods.
arXiv Detail & Related papers (2020-10-26T03:20:42Z) - Conditional Generative Modeling via Learning the Latent Space [54.620761775441046]
We propose a novel framework for conditional generation in multimodal spaces.
It uses latent variables to model generalizable learning patterns.
At inference, the latent variables are optimized to find optimal solutions corresponding to multiple output modes.
arXiv Detail & Related papers (2020-10-07T03:11:34Z) - Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z) - Unsupervised multi-modal Styled Content Generation [61.040392094140245]
UMMGAN is a novel architecture designed to better model multi-modal distributions in an unsupervised fashion.
We show that UMMGAN effectively disentangles between modes and style, thereby providing an independent degree of control over the generated content.
arXiv Detail & Related papers (2020-01-10T19:36:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.