Be More Diverse than the Most Diverse: Online Selection of Diverse Mixtures of Generative Models
- URL: http://arxiv.org/abs/2412.17622v1
- Date: Mon, 23 Dec 2024 14:48:17 GMT
- Title: Be More Diverse than the Most Diverse: Online Selection of Diverse Mixtures of Generative Models
- Authors: Parham Rezaei, Farzan Farnia, Cheuk Ting Li
- Abstract summary: In this work, we explore the selection of a mixture of multiple generative models.
We propose an online learning approach called Mixture Upper Confidence Bound (Mixture-UCB).
- Score: 33.04472814852163
- Abstract: The availability of multiple training algorithms and architectures for generative models requires a selection mechanism to form a single model over a group of well-trained generation models. The selection task is commonly addressed by identifying the model that maximizes an evaluation score based on the diversity and quality of the generated data. However, such a best-model identification approach overlooks the possibility that a mixture of available models can outperform each individual model. In this work, we explore the selection of a mixture of multiple generative models and formulate a quadratic optimization problem to find an optimal mixture model achieving the maximum of kernel-based evaluation scores including kernel inception distance (KID) and Rényi kernel entropy (RKE). To identify the optimal mixture of the models using the fewest possible sample queries, we propose an online learning approach called Mixture Upper Confidence Bound (Mixture-UCB). Specifically, our proposed online learning method can be extended to every convex quadratic function of the mixture weights, for which we prove a concentration bound to enable the application of the UCB approach. We prove a regret bound for the proposed Mixture-UCB algorithm and perform several numerical experiments to show the success of the proposed Mixture-UCB method in finding the optimal mixture of text-based and image-based generative models. The codebase is available at https://github.com/Rezaei-Parham/Mixture-UCB.
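For kernel-based scores such as KID, the dependence on the mixture weights is quadratic, which is what makes the offline version of the selection problem a quadratic program over the probability simplex. The sketch below illustrates this formulation for a KID-style (squared-MMD) objective with a polynomial kernel; the kernel choice, the biased plug-in estimator, and the SLSQP solver are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

def poly_kernel(X, Y, degree=3, gamma=None, coef0=1.0):
    """Polynomial kernel of the kind used for KID; gamma defaults to 1/dim."""
    if gamma is None:
        gamma = 1.0 / X.shape[1]
    return (gamma * X @ Y.T + coef0) ** degree

def optimal_mixture_weights(model_feats, ref_feats):
    """Minimize a KID-style squared MMD of a mixture of m models w.r.t. reference data.

    For mixture weights alpha, MMD^2(alpha) = alpha^T A alpha - 2 b^T alpha + const,
    a convex quadratic minimized over the probability simplex. The constant term
    (reference-vs-reference kernel mean) is dropped since it does not affect the argmin.
    """
    m = len(model_feats)
    A = np.array([[poly_kernel(Xi, Xj).mean() for Xj in model_feats] for Xi in model_feats])
    b = np.array([poly_kernel(Xi, ref_feats).mean() for Xi in model_feats])

    def objective(alpha):
        return alpha @ A @ alpha - 2.0 * b @ alpha

    constraints = [{"type": "eq", "fun": lambda a: a.sum() - 1.0}]
    bounds = [(0.0, 1.0)] * m
    alpha0 = np.full(m, 1.0 / m)
    res = minimize(objective, alpha0, bounds=bounds, constraints=constraints, method="SLSQP")
    return res.x

# Toy usage: three "models" emitting 64-dim features, plus reference features.
rng = np.random.default_rng(0)
models = [rng.normal(loc=mu, size=(200, 64)) for mu in (0.0, 0.5, 1.0)]
reference = rng.normal(loc=0.4, size=(400, 64))
print(optimal_mixture_weights(models, reference))  # non-negative weights summing to 1
```

The Mixture-UCB algorithm targets the online variant of this problem: the entries of the quadratic form are estimated from a limited number of sample queries, and UCB-style confidence terms guide which models to query next, with the paper proving a concentration bound and a regret bound for this procedure.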
Related papers
- Amortized Bayesian Mixture Models [1.3976439685325095]
This paper introduces a novel extension of Amortized Bayesian Inference (ABI) tailored to mixture models.
We factorize the posterior into a distribution of the parameters and a distribution of (categorical) mixture indicators, which allows us to use a combination of generative neural networks.
The proposed framework accommodates both independent and dependent mixture models, enabling filtering and smoothing.
arXiv Detail & Related papers (2025-01-17T14:51:03Z)
- Supervised Score-Based Modeling by Gradient Boosting [49.556736252628745]
We propose a Supervised Score-based Model (SSM), which can be viewed as a gradient boosting algorithm combined with score matching.
We provide a theoretical analysis of learning and sampling for SSM to balance inference time and prediction accuracy.
Our model outperforms existing models in both accuracy and inference time.
arXiv Detail & Related papers (2024-11-02T07:06:53Z)
- Stabilizing black-box model selection with the inflated argmax [8.52745154080651]
We present a new approach to stabilizing model selection with theoretical stability guarantees.
Our method selects a small collection of models that all fit the data, and it is stable in the sense that, with high probability, removing any single training point yields a collection of selected models that overlaps with the original one.
arXiv Detail & Related papers (2024-10-23T20:39:07Z)
- An Online Learning Approach to Prompt-based Selection of Generative Models [23.91197677628145]
Online identification of the best generation model for each input prompt can reduce the costs associated with querying sub-optimal models.
We propose an online learning framework to predict the best data generation model for a given input prompt.
Our experiments on real and simulated text-to-image and image-to-text generative models show that RFF-UCB performs successfully in identifying the best generation model.
arXiv Detail & Related papers (2024-10-17T07:33:35Z)
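The prompt-based selection entry above shares an online-learning backbone with the main paper: treat each generative model as an arm of a bandit and use optimistic (UCB) estimates of its evaluation score to decide which model to query next. The following is a minimal UCB1-style sketch; the `evaluate` function, the [0, 1] score scale, and the exploration constant are assumptions, and the paper's RFF-UCB additionally conditions on prompt features via random Fourier features, which is omitted here.

```python
import math
import random

def ucb_select(evaluate, num_models, horizon, c=2.0):
    """UCB1-style online selection among num_models generative models.

    evaluate(i) is assumed to return a noisy per-query quality score in [0, 1]
    for model i (e.g., a prompt-conditioned kernel-based score).
    """
    counts = [0] * num_models
    sums = [0.0] * num_models
    for t in range(1, horizon + 1):
        if t <= num_models:          # query each model once to initialize estimates
            arm = t - 1
        else:                        # then pick the model with the highest optimistic score
            arm = max(
                range(num_models),
                key=lambda i: sums[i] / counts[i] + c * math.sqrt(math.log(t) / counts[i]),
            )
        reward = evaluate(arm)
        counts[arm] += 1
        sums[arm] += reward
    best = max(range(num_models), key=lambda i: sums[i] / counts[i])
    return best, counts

# Toy usage: three models with unknown mean scores; noise simulates per-prompt variation.
means = [0.55, 0.70, 0.62]
noisy = lambda i: min(1.0, max(0.0, random.gauss(means[i], 0.1)))
print(ucb_select(noisy, num_models=3, horizon=2000))
```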
- Learning Energy-Based Models by Cooperative Diffusion Recovery Likelihood [64.95663299945171]
Training energy-based models (EBMs) on high-dimensional data can be both challenging and time-consuming.
There exists a noticeable gap in sample quality between EBMs and other generative frameworks like GANs and diffusion models.
We propose cooperative diffusion recovery likelihood (CDRL), an effective approach to tractably learn and sample from a series of EBMs.
arXiv Detail & Related papers (2023-09-10T22:05:24Z)
- MILO: Model-Agnostic Subset Selection Framework for Efficient Model Training and Tuning [68.12870241637636]
We propose MILO, a model-agnostic subset selection framework that decouples the subset selection from model training.
Our empirical results indicate that MILO can train models $3\times$-$10\times$ faster and tune hyperparameters $20\times$-$75\times$ faster than full-dataset training or tuning, without compromising performance.
arXiv Detail & Related papers (2023-01-30T20:59:30Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
However, the training data behind individual fine-tuned models is often unavailable, which creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
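The dataless knowledge fusion entry above merges fine-tuned models directly in parameter space. A minimal baseline for this setting, assuming the models share an architecture and a common initialization, is weighted averaging of their parameters; the paper's own regression-based, per-layer merging rule is not reproduced here.

```python
import torch

def merge_state_dicts(state_dicts, weights=None):
    """Merge models fine-tuned from the same initialization by weighted parameter averaging.

    This is the simple-average baseline for dataless merging; it requires no training data,
    only the parameters of the individual models.
    """
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for name, ref in state_dicts[0].items():
        if not torch.is_floating_point(ref):
            merged[name] = ref.clone()  # copy integer buffers (e.g., step counters) unchanged
        else:
            merged[name] = sum(w * sd[name] for w, sd in zip(weights, state_dicts))
    return merged

# Usage (assuming model_a, model_b, and fused_model share an architecture):
# fused_model.load_state_dict(merge_state_dicts([model_a.state_dict(), model_b.state_dict()]))
```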
- Model ensemble instead of prompt fusion: a sample-specific knowledge transfer method for few-shot prompt tuning [85.55727213502402]
We focus on improving the few-shot performance of prompt tuning by transferring knowledge from soft prompts of source tasks.
We propose Sample-specific Ensemble of Source Models (SESoM).
SESoM learns to adjust the contribution of each source model for each target sample separately when ensembling source model outputs.
arXiv Detail & Related papers (2022-10-23T01:33:16Z)
- A hybrid ensemble method with negative correlation learning for regression [2.8484009470171943]
This study automatically selects and weights sub-models from a heterogeneous model pool.
It solves an optimization problem using an interior-point filter line-search algorithm.
The value of this study lies in its ease of use and effectiveness, allowing the hybrid ensemble to embrace diversity and accuracy.
arXiv Detail & Related papers (2021-04-06T06:45:14Z)
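The hybrid ensemble entry above, like the main paper, searches for sub-model weights by numerical optimization. The sketch below shows one way to trade ensemble accuracy against a negative-correlation (diversity) term over simplex-constrained weights; the exact objective in the cited paper may differ, and SciPy's SLSQP is used here instead of the interior-point filter line-search solver mentioned in the summary.

```python
import numpy as np
from scipy.optimize import minimize

def ncl_ensemble_weights(preds, y, lam=0.5):
    """Weight sub-model predictions on a simplex, trading accuracy against diversity.

    preds has shape (num_models, num_samples). The objective is the ensemble's squared
    error minus a lambda-weighted negative-correlation (diversity) term.
    """
    m, _ = preds.shape

    def objective(w):
        ens = w @ preds                               # ensemble prediction per sample
        accuracy = np.mean((ens - y) ** 2)
        diversity = np.mean(w @ (preds - ens) ** 2)   # spread of sub-models around the ensemble
        return accuracy - lam * diversity

    cons = [{"type": "eq", "fun": lambda w: w.sum() - 1.0}]
    res = minimize(objective, np.full(m, 1.0 / m), bounds=[(0.0, 1.0)] * m,
                   constraints=cons, method="SLSQP")
    return res.x

# Toy usage: three regressors on a noisy linear target.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 500)
y = 2 * x + rng.normal(0, 0.1, 500)
preds = np.stack([2.2 * x, 1.8 * x + 0.05, 2.0 * x - 0.1])
print(ncl_ensemble_weights(preds, y))
```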
- Semi-nonparametric Latent Class Choice Model with a Flexible Class Membership Component: A Mixture Model Approach [6.509758931804479]
The proposed model formulates the latent classes using mixture models as an alternative approach to the traditional random utility specification.
Results show that mixture models improve the overall performance of latent class choice models.
arXiv Detail & Related papers (2020-07-06T13:19:26Z)
- Stepwise Model Selection for Sequence Prediction via Deep Kernel Learning [100.83444258562263]
We propose a novel Bayesian optimization (BO) algorithm to tackle the challenge of model selection in this setting.
In order to solve the resulting multiple black-box function optimization problem jointly and efficiently, we exploit potential correlations among black-box functions.
We are the first to formulate the problem of stepwise model selection (SMS) for sequence prediction, and to design and demonstrate an efficient joint-learning algorithm for this purpose.
arXiv Detail & Related papers (2020-01-12T09:42:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.