Compositional Generative Modeling: A Single Model is Not All You Need
- URL: http://arxiv.org/abs/2402.01103v3
- Date: Mon, 3 Jun 2024 23:30:33 GMT
- Title: Compositional Generative Modeling: A Single Model is Not All You Need
- Authors: Yilun Du, Leslie Kaelbling
- Abstract summary: We argue that we should instead construct large generative systems by composing smaller generative models together.
We show how such a compositional generative approach enables us to learn distributions in a more data-efficient manner.
- Score: 29.050431676226115
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large monolithic generative models trained on massive amounts of data have become an increasingly dominant approach in AI research. In this paper, we argue that we should instead construct large generative systems by composing smaller generative models together. We show how such a compositional generative approach enables us to learn distributions in a more data-efficient manner, enabling generalization to parts of the data distribution unseen at training time. We further show how this enables us to program and construct new generative models for tasks completely unseen at training. Finally, we show that in many cases, we can discover separate compositional components from data.
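To make the compositional idea concrete, here is a minimal sketch in the spirit of the paper (not the authors' implementation): composing two energy-based models by summing their energies corresponds to taking a product of their distributions, which can then be sampled with Langevin dynamics. The two Gaussian energies and all hyperparameters below are illustrative assumptions.
```python
import numpy as np

# Two simple component models, each an energy function over x in R^2.
# Summing energies composes them as a product of distributions:
# p(x) ∝ exp(-E1(x)) * exp(-E2(x)) = exp(-(E1(x) + E2(x))).

def energy_a(x):
    # Component A: prefers points near (0, 0).
    return 0.5 * np.sum(x ** 2)

def energy_b(x):
    # Component B: prefers points near (2, 2).
    return 0.5 * np.sum((x - 2.0) ** 2)

def grad(f, x, eps=1e-4):
    # Numerical gradient, to keep the sketch dependency-free.
    g = np.zeros_like(x)
    for i in range(x.size):
        d = np.zeros_like(x); d[i] = eps
        g[i] = (f(x + d) - f(x - d)) / (2 * eps)
    return g

def langevin_sample(energies, steps=500, step_size=1e-2, seed=0):
    # Langevin dynamics on the summed energy of all components.
    rng = np.random.default_rng(seed)
    x = rng.normal(size=2)
    for _ in range(steps):
        g = sum(grad(e, x) for e in energies)
        x = x - step_size * g + np.sqrt(2 * step_size) * rng.normal(size=2)
    return x

# Samples concentrate where both components agree (near (1, 1)).
print(langevin_sample([energy_a, energy_b]))
```
Because the composition happens at sampling time, new combinations of components can be specified without retraining either one, which is the sense in which such systems can be "programmed" for unseen tasks.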
Related papers
- Knowledge Fusion By Evolving Weights of Language Models [5.354527640064584]
This paper examines the approach of integrating multiple models into a unified model.
We propose a knowledge fusion method named Evolver, inspired by evolutionary algorithms.
arXiv Detail & Related papers (2024-06-18T02:12:34Z)
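A minimal sketch of the general idea of evolving merge weights (assumed mechanics; Evolver's actual operators and evaluation may differ): treat convex combination coefficients over the models' parameter vectors as the genome and select candidates by a fitness score. The `fitness` function below is a hypothetical placeholder for validation-set evaluation.
```python
import numpy as np

def fitness(merged):
    # Hypothetical placeholder: in practice, evaluate the merged model
    # on a validation task and return a score (higher is better).
    target = np.array([1.0, -0.5, 0.25])   # assumed ideal parameters
    return -np.linalg.norm(merged - target)

def evolve_merge(param_sets, pop=20, gens=30, sigma=0.1, seed=0):
    """Evolve convex combination weights over several models' parameters."""
    rng = np.random.default_rng(seed)
    k = len(param_sets)
    stacked = np.stack(param_sets)                 # (k, d)
    genomes = rng.dirichlet(np.ones(k), size=pop)  # mixing weights
    for _ in range(gens):
        merged = genomes @ stacked                 # (pop, d) merged params
        scores = np.array([fitness(m) for m in merged])
        parents = genomes[np.argsort(scores)[-pop // 2:]]  # keep best half
        children = parents + rng.normal(0, sigma, parents.shape)
        children = np.abs(children)
        children /= children.sum(axis=1, keepdims=True)    # renormalize
        genomes = np.concatenate([parents, children])
    merged = genomes @ stacked
    best = genomes[np.argmax([fitness(m) for m in merged])]
    return best, best @ stacked

# Three "models" with flattened parameters; evolve their fusion weights.
models = [np.array([1.2, -0.4, 0.0]), np.array([0.8, -0.6, 0.5]),
          np.array([1.0, 0.0, 0.3])]
weights, fused = evolve_merge(models)
print(weights, fused)
```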
- Heat Death of Generative Models in Closed-Loop Learning [63.83608300361159]
We study the learning dynamics of generative models that are fed back their own produced content in addition to their original training dataset.
We show that, unless a sufficient amount of external data is introduced at each iteration, any non-trivial temperature leads the model to degenerate.
arXiv Detail & Related papers (2024-04-02T21:51:39Z)
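The degeneration effect can be reproduced in a toy setting. The following is a minimal construction of my own, not the paper's experiment: repeatedly refit a Gaussian to its own samples drawn at a temperature below one; the fitted scale collapses unless external data is mixed in at every iteration.
```python
import numpy as np

rng = np.random.default_rng(0)
external = rng.normal(0.0, 1.0, size=1000)   # the "real" data distribution

def closed_loop(fraction_external, iters=200, n=1000, temperature=0.9):
    """Refit a Gaussian on its own samples, optionally mixing in real data."""
    mu, sigma = 0.0, 1.0
    for _ in range(iters):
        # Sample from the current model at a given temperature (< 1 sharpens).
        generated = rng.normal(mu, temperature * sigma, size=n)
        k = int(fraction_external * n)
        data = np.concatenate([generated[: n - k],
                               rng.choice(external, size=k)])
        mu, sigma = data.mean(), data.std()
    return sigma

print("no external data:  sigma =", closed_loop(0.0))   # collapses toward 0
print("20% external data: sigma =", closed_loop(0.2))   # stays bounded away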
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
Fine-tuned models are often available while their training data is not, which creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
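As a hedged sketch of merging in parameter space, the simplest possible baseline is elementwise weight averaging; the paper proposes a more sophisticated merging rule, and this only illustrates that fusion needs no training data at merge time.
```python
import numpy as np

def merge_state_dicts(state_dicts, weights=None):
    """Merge models by (weighted) averaging in parameter space.

    A deliberately simple baseline: even plain averaging requires
    no training data when fusing the checkpoints.
    """
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    keys = state_dicts[0].keys()
    return {k: sum(w * sd[k] for w, sd in zip(weights, state_dicts))
            for k in keys}

# Two hypothetical fine-tuned checkpoints with identical architectures.
model_a = {"linear.weight": np.array([[1.0, 2.0]]), "linear.bias": np.array([0.5])}
model_b = {"linear.weight": np.array([[3.0, 0.0]]), "linear.bias": np.array([-0.5])}

fused = merge_state_dicts([model_a, model_b])
print(fused)  # {"linear.weight": [[2.0, 1.0]], "linear.bias": [0.0]}
```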
- Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data.
Instead, we are given access to a set of expert models and their predictions, alongside some limited information about the datasets used to train them.
arXiv Detail & Related papers (2022-10-11T10:20:31Z)
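One way to make the instance-wise idea concrete (an assumption-laden sketch, not the paper's algorithm): weight each expert per test point by proximity to a summary of that expert's training region. The Gaussian-kernel weighting and the known training-data centers below are assumptions.
```python
import numpy as np

def instance_wise_ensemble(x, experts, centers, lengthscale=1.0):
    """Weight each expert's prediction by the test point's proximity to the
    (assumed known) center of that expert's training data."""
    # Gaussian kernel similarity to each expert's training region.
    sims = np.array([np.exp(-np.sum((x - c) ** 2) / (2 * lengthscale ** 2))
                     for c in centers])
    w = sims / sims.sum()
    preds = np.array([f(x) for f in experts])
    return w @ preds, w

# Two hypothetical experts, each reliable in a different input region.
expert_low  = lambda x: 0.0   # trained near x = -2
expert_high = lambda x: 1.0   # trained near x = +2
centers = [np.array([-2.0]), np.array([2.0])]

for x in [np.array([-2.0]), np.array([0.0]), np.array([2.0])]:
    pred, w = instance_wise_ensemble(x, [expert_low, expert_high], centers)
    print(x, "->", round(float(pred), 3), "weights:", np.round(w, 3))
```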
- Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data alone by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z)
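A rough sketch in the spirit of the approach (my simplification, not the paper's estimator): posit a Gaussian, i.e. maximum-entropy, feature distribution matching each aggregate's reported moments, draw pseudo-samples with the aggregate's positive rate as a soft label, and fit a logistic model by gradient descent.
```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_from_aggregates(aggregates, n_pseudo=500, lr=0.1, steps=2000, seed=0):
    """Fit logistic weights from (feature mean, feature std, positive rate)
    per group, using a Gaussian max-entropy stand-in for unseen features."""
    rng = np.random.default_rng(seed)
    X, y = [], []
    for mean, std, pos_rate in aggregates:
        X.append(rng.normal(mean, std, size=(n_pseudo, len(mean))))
        y.append(np.full(n_pseudo, pos_rate))     # soft labels
    X, y = np.vstack(X), np.concatenate(y)
    w = np.zeros(X.shape[1])
    for _ in range(steps):                        # plain gradient descent
        w -= lr * X.T @ (sigmoid(X @ w) - y) / len(y)
    return w

# Two aggregates: per-group feature mean/std and fraction of positive labels.
aggs = [(np.array([-1.0]), np.array([0.5]), 0.2),
        (np.array([1.0]),  np.array([0.5]), 0.8)]
print(fit_from_aggregates(aggs))  # positive weight: higher feature -> positive
```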
- Hierarchical Few-Shot Generative Models [18.216729811514718]
We study a latent-variable approach that extends the Neural Statistician to a fully hierarchical model with attention-based point-to-set-level aggregation.
Our results show that the hierarchical formulation better captures the intrinsic variability within the sets in the small data regime.
arXiv Detail & Related papers (2021-10-23T19:19:39Z)
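The attention-based point-to-set aggregation can be sketched as a single forward pass (assumed dimensions and a single learned query; not the paper's full architecture):
```python
import numpy as np

def softmax(z):
    z = z - z.max()
    return np.exp(z) / np.exp(z).sum()

def attention_set_pool(points, query, W_k, W_v):
    """Aggregate a set of point embeddings into one set-level vector
    with a single learned query (point-to-set attention)."""
    keys = points @ W_k                      # (n, d_k)
    values = points @ W_v                    # (n, d_v)
    scores = keys @ query / np.sqrt(len(query))
    attn = softmax(scores)                   # (n,) weights over the points
    return attn @ values                     # (d_v,) set embedding

rng = np.random.default_rng(0)
points = rng.normal(size=(5, 8))             # a set of 5 point embeddings
W_k, W_v = rng.normal(size=(8, 4)), rng.normal(size=(8, 4))
query = rng.normal(size=4)                   # learned set-level query
print(attention_set_pool(points, query, W_k, W_v))
```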
- Generalized Multimodal ELBO [11.602089225841631]
Multiple data types naturally co-occur when describing real-world phenomena, and learning from them is a long-standing goal in machine learning research.
Existing self-supervised generative models approximating an ELBO are not able to fulfill all desired requirements of multimodal models.
We propose a new, generalized ELBO formulation for multimodal data that overcomes these limitations.
arXiv Detail & Related papers (2021-05-06T07:05:00Z)
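To ground the multimodal-posterior discussion, here is a hedged sketch of one common aggregation scheme in this literature, a product of Gaussian experts over unimodal posteriors (the paper generalizes over such choices rather than prescribing this one):
```python
import numpy as np

def product_of_gaussian_experts(mus, logvars):
    """Combine unimodal Gaussian posteriors q(z|x_m) into a joint posterior.

    A product of Gaussians is Gaussian with precision-weighted mean; any
    subset of modalities can be combined, so missing inputs are handled.
    """
    precisions = [np.exp(-lv) for lv in logvars]
    total_prec = sum(precisions)
    mu = sum(p * m for p, m in zip(precisions, mus)) / total_prec
    return mu, -np.log(total_prec)            # joint mean, joint logvar

# Two modalities' posteriors over a 3-dim latent.
mu_img, lv_img = np.array([0.0, 1.0, -1.0]), np.array([0.0, 0.0, 0.0])
mu_txt, lv_txt = np.array([1.0, 1.0, 0.0]),  np.array([1.0, 1.0, 1.0])

print(product_of_gaussian_experts([mu_img, mu_txt], [lv_img, lv_txt]))
print(product_of_gaussian_experts([mu_img], [lv_img]))  # text missing
```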
- Relating by Contrasting: A Data-efficient Framework for Multimodal Generative Models [86.9292779620645]
We develop a contrastive framework for generative model learning, allowing us to train the model not just on the commonality between modalities, but also on the distinction between "related" and "unrelated" multimodal data.
Under the proposed framework, the generative model can accurately distinguish related samples from unrelated ones, making it possible to exploit the plentiful unlabeled, unpaired multimodal data.
arXiv Detail & Related papers (2020-07-02T15:08:11Z)
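A minimal sketch of a contrastive term over related versus unrelated multimodal pairs (an InfoNCE-style stand-in chosen for illustration, not taken from the paper): matched pairs sit on the diagonal of a batch similarity matrix, and mismatched pairings serve as negatives.
```python
import numpy as np

def softmax_rows(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def contrastive_loss(emb_a, emb_b, temperature=0.1):
    """InfoNCE-style loss: row i of each modality is 'related' to row i
    of the other; every other row in the batch is 'unrelated'."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature          # (n, n) similarity matrix
    probs = softmax_rows(logits)
    return -np.log(np.diag(probs)).mean()   # related pairs on the diagonal

rng = np.random.default_rng(0)
images = rng.normal(size=(4, 16))                # one modality's embeddings
texts = images + 0.1 * rng.normal(size=(4, 16))  # roughly aligned pairs
print(contrastive_loss(images, texts))           # low: related pairs dominate
print(contrastive_loss(images, rng.normal(size=(4, 16))))  # higher
```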
- Unsupervised multi-modal Styled Content Generation [61.040392094140245]
UMMGAN is a novel architecture designed to better model multi-modal distributions in an unsupervised fashion.
We show that UMMGAN effectively disentangles modes from style, thereby providing independent control over the generated content.
arXiv Detail & Related papers (2020-01-10T19:36:08Z)
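A minimal sketch of separating modes from style (my simplified construction, not the UMMGAN architecture): a discrete mode index selects a learned embedding while a continuous style vector modulates it, giving independent handles on the two factors.
```python
import numpy as np

rng = np.random.default_rng(0)
n_modes, style_dim, out_dim = 3, 4, 8

mode_embeddings = rng.normal(size=(n_modes, out_dim))  # learned per-mode codes
W_style = rng.normal(size=(style_dim, out_dim))        # style projection

def generate(mode, style):
    """Toy generator: the mode picks the coarse content, style modulates it."""
    return mode_embeddings[mode] + 0.1 * (style @ W_style)

style = rng.normal(size=style_dim)
# Same style, different modes -> different content, shared "look".
print(generate(0, style))
print(generate(1, style))
# Same mode, different styles -> same content, varied appearance.
print(generate(0, rng.normal(size=style_dim)))
```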