Be More Diverse than the Most Diverse: Optimal Mixtures of Generative Models via Mixture-UCB Bandit Algorithms
- URL: http://arxiv.org/abs/2412.17622v2
- Date: Sat, 22 Mar 2025 10:45:56 GMT
- Title: Be More Diverse than the Most Diverse: Optimal Mixtures of Generative Models via Mixture-UCB Bandit Algorithms
- Authors: Parham Rezaei, Farzan Farnia, Cheuk Ting Li
- Abstract summary: We numerically show that a mixture of generative models on benchmark image datasets can indeed achieve a better evaluation score. We propose the Mixture Upper Confidence Bound (Mixture-UCB) algorithm that provably converges to the optimal mixture of the involved models.
- Score: 33.04472814852163
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The availability of multiple training algorithms and architectures for generative models requires a selection mechanism to form a single model over a group of well-trained generation models. The selection task is commonly addressed by identifying the model that maximizes an evaluation score based on the diversity and quality of the generated data. However, such a best-model identification approach overlooks the possibility that a mixture of available models can outperform each individual model. In this work, we numerically show that a mixture of generative models on benchmark image datasets can indeed achieve a better evaluation score (based on FID and KID scores), compared to the individual models. This observation motivates the development of efficient algorithms for selecting the optimal mixture of the models. To address this, we formulate a quadratic optimization problem to find an optimal mixture model achieving the maximum of kernel-based evaluation scores including kernel inception distance (KID) and Rényi kernel entropy (RKE). To identify the optimal mixture of the models using the fewest possible sample queries, we view the selection task as a multi-armed bandit (MAB) problem and propose the Mixture Upper Confidence Bound (Mixture-UCB) algorithm that provably converges to the optimal mixture of the involved models. More broadly, the proposed Mixture-UCB can be extended to optimize every convex quadratic function of the mixture weights in a general MAB setting. We prove a regret bound for the Mixture-UCB algorithm and perform several numerical experiments to show the success of Mixture-UCB in finding the optimal mixture of text and image generative models. The project code is available at https://github.com/Rezaei-Parham/Mixture-UCB.
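To make the quadratic formulation concrete, below is a minimal sketch (independent of the authors' released code) of how a kernel-based mixture objective can be assembled and solved over the simplex. The helper names (`rbf_kernel`, `mixture_kid_objective`, `optimal_mixture`), the RBF kernel choice, the bandwidth `sigma`, and the plug-in (biased) KID estimate are all illustrative assumptions; the paper's actual estimator and kernels may differ.

```python
# Sketch of the quadratic mixture optimization: the KID-style score of a
# mixture with weights alpha is quadratic in alpha,
#   alpha^T A alpha - 2 b^T alpha + c,
# where A[i, j] is the mean kernel between samples of models i and j and
# b[i] is the mean kernel between model i and the reference data.
import numpy as np
from scipy.optimize import minimize

def rbf_kernel(X, Y, sigma=1.0):
    # Pairwise RBF kernel values between the rows of X and Y.
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def mixture_kid_objective(model_samples, ref_samples, sigma=1.0):
    # Build the quadratic coefficients from per-pair kernel statistics.
    K = len(model_samples)
    A = np.empty((K, K))
    b = np.empty(K)
    for i in range(K):
        b[i] = rbf_kernel(model_samples[i], ref_samples, sigma).mean()
        for j in range(K):
            A[i, j] = rbf_kernel(model_samples[i], model_samples[j], sigma).mean()
    c = rbf_kernel(ref_samples, ref_samples, sigma).mean()
    return A, b, c

def optimal_mixture(A, b, c):
    # Minimize the convex quadratic over the probability simplex.
    K = len(b)
    objective = lambda a: a @ A @ a - 2.0 * b @ a + c
    constraints = ({"type": "eq", "fun": lambda a: a.sum() - 1.0},)
    result = minimize(objective, np.full(K, 1.0 / K),
                      bounds=[(0.0, 1.0)] * K, constraints=constraints)
    return result.x, result.fun

# Toy usage: three narrow Gaussian "models" approximating a wider reference.
rng = np.random.default_rng(0)
models = [rng.normal(mu, 1.0, size=(200, 2)) for mu in (-1.0, 0.0, 1.0)]
reference = rng.normal(0.0, 1.5, size=(400, 2))
weights, score = optimal_mixture(*mixture_kid_objective(models, reference))
```

In such a toy setting the solver tends to spread weight across complementary models, which is the paper's starting observation: a mixture can score better on kernel metrics than any single model.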
Related papers
- Amortized Bayesian Mixture Models [1.3976439685325095]
This paper introduces a novel extension of Amortized Bayesian Inference (ABI) tailored to mixture models.
We factorize the posterior into a distribution of the parameters and a distribution of (categorical) mixture indicators, which allows us to use a combination of generative neural networks.
The proposed framework accommodates both independent and dependent mixture models, enabling filtering and smoothing.
arXiv Detail & Related papers (2025-01-17T14:51:03Z) - Supervised Score-Based Modeling by Gradient Boosting [49.556736252628745]
We propose a Supervised Score-based Model (SSM) which can be viewed as a gradient boosting algorithm combined with score matching. We provide a theoretical analysis of learning and sampling for SSM to balance inference time and prediction accuracy. Our model outperforms existing models in both accuracy and inference time.
arXiv Detail & Related papers (2024-11-02T07:06:53Z) - Stabilizing black-box model selection with the inflated argmax [8.52745154080651]
This paper presents a new approach to stabilizing model selection that leverages a combination of bagging and an "inflated" argmax operation.
Our method selects a small collection of models that all fit the data, and it is stable in that, with high probability, the removal of any training point will result in a collection of selected models that overlaps with the original collection.
In both settings, the proposed method yields stable and compact collections of selected models, outperforming a variety of benchmarks.
arXiv Detail & Related papers (2024-10-23T20:39:07Z) - A Multi-Armed Bandit Approach to Online Selection and Evaluation of Generative Models [23.91197677628145]
In this work, we propose an online evaluation and selection framework to find the generative model that maximizes a standard assessment score.
Specifically, we develop the MAB-based selection of generative models considering the Fréchet Distance (FD) and Inception Score (IS) metrics.
Our empirical results suggest the efficacy of MAB approaches for the sample-efficient evaluation and selection of deep generative models; a generic UCB-style sketch is given after this list.
arXiv Detail & Related papers (2024-06-11T16:57:48Z) - MILO: Model-Agnostic Subset Selection Framework for Efficient Model
Training and Tuning [68.12870241637636]
We propose MILO, a model-agnostic subset selection framework that decouples the subset selection from model training.
Our empirical results indicate that MILO can train models $3\times - 10\times$ faster and tune hyperparameters $20\times - 75\times$ faster than full-dataset training or tuning, without compromising performance.
arXiv Detail & Related papers (2023-01-30T20:59:30Z) - Model ensemble instead of prompt fusion: a sample-specific knowledge
transfer method for few-shot prompt tuning [85.55727213502402]
We focus on improving the few-shot performance of prompt tuning by transferring knowledge from soft prompts of source tasks.
We propose Sample-specific Ensemble of Source Models (SESoM).
SESoM learns to adjust the contribution of each source model for each target sample separately when ensembling source model outputs.
arXiv Detail & Related papers (2022-10-23T01:33:16Z) - A Bagging and Boosting Based Convexly Combined Optimum Mixture
Probabilistic Model [0.0]
A bagging- and boosting-based convexly combined mixture probabilistic model is suggested.
The model is obtained by iteratively searching for the optimum probabilistic model that provides the maximum p-value.
arXiv Detail & Related papers (2021-06-08T04:20:00Z) - A hybrid ensemble method with negative correlation learning for
regression [2.8484009470171943]
The proposed method automatically selects and weights sub-models from a heterogeneous model pool.
It solves the resulting optimization problem using an interior-point filter line-search algorithm.
The value of this study lies in its ease of use and effectiveness, allowing the hybrid ensemble to combine diversity with accuracy.
arXiv Detail & Related papers (2021-04-06T06:45:14Z) - Community Detection in the Stochastic Block Model by Mixed Integer
Programming [3.8073142980733]
The Degree-Corrected Stochastic Block Model (DCSBM) is a popular model for generating random graphs with community structure given an expected degree sequence.
The standard approach to community detection based on the DCSBM is to search for the model parameters that are most likely to have produced the observed network data through maximum likelihood estimation (MLE).
We present mathematical programming formulations and exact solution methods that can provably find the model parameters and community assignments of maximum likelihood given an observed graph.
arXiv Detail & Related papers (2021-01-26T22:04:40Z) - Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z) - Semi-nonparametric Latent Class Choice Model with a Flexible Class
Membership Component: A Mixture Model Approach [6.509758931804479]
The proposed model formulates the latent classes using mixture models as an alternative approach to the traditional random utility specification.
Results show that mixture models improve the overall performance of latent class choice models.
arXiv Detail & Related papers (2020-07-06T13:19:26Z) - Learning Gaussian Graphical Models via Multiplicative Weights [54.252053139374205]
We adapt an algorithm of Klivans and Meka based on the method of multiplicative weight updates.
The algorithm enjoys a sample complexity bound that is qualitatively similar to others in the literature.
It has a low runtime $O(mp^2)$ in the case of $m$ samples and $p$ nodes, and can trivially be implemented in an online manner.
arXiv Detail & Related papers (2020-02-20T10:50:58Z) - Expected Information Maximization: Using the I-Projection for Mixture
Density Estimation [22.096148237257644]
Modelling highly multi-modal data is a challenging problem in machine learning.
We present a new algorithm called Expected Information Maximization (EIM) for computing the I-projection.
We show that our algorithm is much more effective in computing the I-projection than recent GAN approaches.
arXiv Detail & Related papers (2020-01-23T17:24:50Z) - Stepwise Model Selection for Sequence Prediction via Deep Kernel
Learning [100.83444258562263]
We propose a novel Bayesian optimization (BO) algorithm to tackle the challenge of model selection in this setting.
In order to solve the resulting multiple black-box function optimization problem jointly and efficiently, we exploit potential correlations among black-box functions.
We are the first to formulate the problem of stepwise model selection (SMS) for sequence prediction, and to design and demonstrate an efficient joint-learning algorithm for this purpose.
arXiv Detail & Related papers (2020-01-12T09:42:19Z)