Related papers: An Online Learning Approach to Prompt-based Selection of Generative Models

URL: http://arxiv.org/abs/2410.13287v1
Date: Thu, 17 Oct 2024 07:33:35 GMT
Title: An Online Learning Approach to Prompt-based Selection of Generative Models
Authors: Xiaoyan Hu, Ho-fung Leung, Farzan Farnia
Abstract summary: An online identification of the best generation model for various input prompts can reduce the costs associated with querying sub-optimal models. We propose an online learning framework to predict the best data generation model for a given input prompt. Our experiments on real and simulated text-to-image and image-to-text generative models show RFF-UCB performs successfully in identifying the best generation model.
Score: 23.91197677628145
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Selecting a sample generation scheme from multiple text-based generative models is typically addressed by choosing the model that maximizes an averaged evaluation score. However, this score-based selection overlooks the possibility that different models achieve the best generation performance for different types of text prompts. An online identification of the best generation model for various input prompts can reduce the costs associated with querying sub-optimal models. In this work, we explore the possibility of varying rankings of text-based generative models for different text prompts and propose an online learning framework to predict the best data generation model for a given input prompt. The proposed framework adapts the kernelized contextual bandit (CB) methodology to a CB setting with shared context variables across arms, utilizing the generated data to update a kernel-based function that predicts which model will achieve the highest score for unseen text prompts. Additionally, we apply random Fourier features (RFF) to the kernelized CB algorithm to accelerate the online learning process and establish a $\widetilde{\mathcal{O}}(\sqrt{T})$ regret bound for the proposed RFF-based CB algorithm over T iterations. Our numerical experiments on real and simulated text-to-image and image-to-text generative models show RFF-UCB performs successfully in identifying the best generation model across different sample types.
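
The abstract's core idea (map each prompt embedding to random Fourier features, then pick the model with the highest upper confidence bound on its predicted score) can be illustrated with a small sketch. This is not the authors' implementation: it assumes an RBF kernel, fits an independent ridge regressor per model on a shared feature map, and uses synthetic 2-D "prompt embeddings" and scores. The names `RFFUCB` and `run_demo` and all parameter values are hypothetical.

```python
import numpy as np


class RFFUCB:
    """Illustrative per-arm ridge regression on random Fourier features with a UCB bonus."""

    def __init__(self, n_arms, dim, n_features=200, gamma=0.5, lam=1.0, alpha=0.5, seed=0):
        rng = np.random.default_rng(seed)
        # RFF for the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2):
        # frequencies ~ N(0, 2 * gamma * I), phases ~ Uniform[0, 2 * pi).
        self.W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(n_features, dim))
        self.b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
        self.alpha = alpha  # exploration weight (assumed value)
        # One regularized Gram matrix and one response vector per arm (model).
        self.A = np.stack([lam * np.eye(n_features) for _ in range(n_arms)])
        self.y = np.zeros((n_arms, n_features))

    def features(self, x):
        """Map a context x to features z with z(x) . z(x') approximating k(x, x')."""
        return np.sqrt(2.0 / self.W.shape[0]) * np.cos(self.W @ x + self.b)

    def select(self, x):
        """Return the arm with the largest predicted score plus confidence bonus."""
        z = self.features(x)
        ucb = np.empty(len(self.A))
        for k in range(len(self.A)):
            A_inv_z = np.linalg.solve(self.A[k], z)
            mean = A_inv_z @ self.y[k]  # equals z^T A^{-1} y because A is symmetric
            bonus = self.alpha * np.sqrt(z @ A_inv_z)
            ucb[k] = mean + bonus
        return int(np.argmax(ucb))

    def update(self, arm, x, score):
        """Fold the observed score for the chosen arm into that arm's statistics."""
        z = self.features(x)
        self.A[arm] += np.outer(z, z)
        self.y[arm] += score * z


def run_demo(rounds=300, seed=1):
    """Synthetic setting: arm 0 is best when x[0] > 0, arm 1 is best otherwise."""
    rng = np.random.default_rng(seed)
    bandit = RFFUCB(n_arms=2, dim=2)
    for _ in range(rounds):
        center = np.array([2.0, 0.0]) if rng.random() < 0.5 else np.array([-2.0, 0.0])
        x = center + 0.1 * rng.normal(size=2)
        arm = bandit.select(x)
        score = 1.0 if (arm == 0) == (x[0] > 0) else 0.0
        bandit.update(arm, x, score)
    return bandit
```

In this toy setting the two context clusters prefer different arms, which mirrors the abstract's point that the best model can depend on the prompt. The feature dimension trades approximation quality against the cost of the per-round linear solve.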

Related papers

Variational Prefix Tuning for Diverse and Accurate Code Summarization Using Pre-trained Language Models [3.06414751922655]
Variational Prefix Tuning (VPT) is a novel approach that enhances pre-trained models' ability to generate diverse yet accurate sets of summaries. Our method integrates a Conditional Variational Autoencoder (CVAE) framework as a modular component into pre-trained models.
arXiv Detail & Related papers (2025-05-14T01:46:56Z)
Be More Diverse than the Most Diverse: Optimal Mixtures of Generative Models via Mixture-UCB Bandit Algorithms [33.04472814852163]
We numerically show that a mixture of generative models on benchmark image datasets can indeed achieve a better evaluation score. We propose the Mixture Upper Confidence Bound (Mixture-UCB) algorithm that provably converges to the optimal mixture of the involved models.
arXiv Detail & Related papers (2024-12-23T14:48:17Z)
Conditional Vendi Score: An Information-Theoretic Approach to Diversity Evaluation of Prompt-based Generative Models [15.40817940713399]
We introduce the Conditional-Vendi score based on $H(X|T)$ to quantify the internal diversity of the model. We conduct several numerical experiments to show the correlation between the Conditional-Vendi score and the internal diversity of text-conditioned generative models.
arXiv Detail & Related papers (2024-11-05T05:30:39Z)
A Multi-Armed Bandit Approach to Online Selection and Evaluation of Generative Models [23.91197677628145]
In this work, we propose an online evaluation and selection framework to find the generative model that maximizes a standard assessment score. Specifically, we develop the MAB-based selection of generative models considering the Fréchet Distance (FD) and Inception Score (IS) metrics. Our empirical results suggest the efficacy of MAB approaches for the sample-efficient evaluation and selection of deep generative models.
arXiv Detail & Related papers (2024-06-11T16:57:48Z)
Repurposing Language Models into Embedding Models: Finding the Compute-Optimal Recipe [10.34105218186634]
In this paper, we study how to contrastively train text embedding models in a compute-optimal fashion. Our innovation is an algorithm that produces optimal configurations of model sizes, data quantities, and fine-tuning methods for text-embedding models at different computational budget levels.
arXiv Detail & Related papers (2024-06-06T15:22:33Z)
Contrastive Transformer Learning with Proximity Data Generation for Text-Based Person Search [60.626459715780605]
Given a descriptive text query, text-based person search aims to retrieve the best-matched target person from an image gallery. Such a cross-modal retrieval task is quite challenging due to significant modality gap, fine-grained differences and insufficiency of annotated data. In this paper, we propose a simple yet effective dual Transformer model for text-based person search.
arXiv Detail & Related papers (2023-11-15T16:26:49Z)
Beyond MLE: Convex Learning for Text Generation [34.99340118597274]
We argue that Maximum likelihood estimation (MLE) is not always necessary and optimal, especially for closed-ended text generation tasks like machine translation. We propose a novel class of training objectives based on convex functions, which enables text generation models to focus on highly probable outputs without having to estimate the entire data distribution.
arXiv Detail & Related papers (2023-10-26T08:08:43Z)
RenAIssance: A Survey into AI Text-to-Image Generation in the Era of Large Model [93.8067369210696]
Text-to-image generation (TTI) refers to the usage of models that could process text input and generate high fidelity images based on text descriptions. Diffusion models are one prominent type of generative model used for the generation of images through the systematic introduction of noises with repeating steps. In the era of large models, scaling up model size and the integration with large language models have further improved the performance of TTI models.
arXiv Detail & Related papers (2023-09-02T03:27:20Z)
Generating Images with Multimodal Language Models [78.6660334861137]
We propose a method to fuse frozen text-only large language models with pre-trained image encoder and decoder models. Our model demonstrates a wide suite of multimodal capabilities: image retrieval, novel image generation, and multimodal dialogue.
arXiv Detail & Related papers (2023-05-26T19:22:03Z)
Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models [52.29800567587504]
We propose a learnable sampling model, Text-Conditioned Token Selection (TCTS), to select optimal tokens via localized supervision with text information. TCTS improves not only the image quality but also the semantic alignment of the generated images with the given texts. We validate the efficacy of TCTS combined with Frequency Adaptive Sampling (FAS) with various generative tasks, demonstrating that it significantly outperforms the baselines in image-text alignment and image quality.
arXiv Detail & Related papers (2023-04-04T03:52:49Z)
Lafite2: Few-shot Text-to-Image Generation [132.14211027057766]
We propose a novel method for pre-training text-to-image generation model on image-only datasets. It considers a retrieval-then-optimization procedure to synthesize pseudo text features. It can be beneficial to a wide range of settings, including the few-shot, semi-supervised and fully-supervised learning.
arXiv Detail & Related papers (2022-10-25T16:22:23Z)
Generative Visual Prompt: Unifying Distributional Control of Pre-Trained Generative Models [77.47505141269035]
Generative Visual Prompt (PromptGen) is a framework for distributional control over pre-trained generative models. PromptGen approximates an energy-based model (EBM) and samples images in a feed-forward manner. Code is available at https://github.com/ChenWu98/Generative-Visual-Prompt.
arXiv Detail & Related papers (2022-09-14T22:55:18Z)
Self-augmented Data Selection for Few-shot Dialogue Generation [18.794770678708637]
We adopt the self-training framework to deal with the few-shot MR-to-Text generation problem. We propose a novel data selection strategy to select the data that our generation model is most uncertain about.
arXiv Detail & Related papers (2022-05-19T16:25:50Z)
Evaluation of HTR models without Ground Truth Material [2.4792948967354236]
Evaluation of Handwritten Text Recognition models during their development is straightforward. But the evaluation process becomes tricky as soon as we switch from development to application. We show that lexicon-based evaluation can compete with lexicon-based methods.
arXiv Detail & Related papers (2022-01-17T01:26:09Z)
GQE-PRF: Generative Query Expansion with Pseudo-Relevance Feedback [8.142861977776256]
We propose a novel approach which effectively integrates text generation models into PRF-based query expansion. Our approach generates augmented query terms via neural text generation models conditioned on both the initial query and pseudo-relevance feedback. We evaluate the performance of our approach on information retrieval tasks using two benchmark datasets.
arXiv Detail & Related papers (2021-08-13T01:09:02Z)
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing [78.8500633981247]
This paper surveys and organizes research works in a new paradigm in natural language processing, which we dub "prompt-based learning". Unlike traditional supervised learning, which trains a model to take in an input x and predict an output y as P(y|x), prompt-based learning is based on language models that model the probability of text directly.
arXiv Detail & Related papers (2021-07-28T18:09:46Z)
Few-shot Learning for Topic Modeling [39.56814839510978]
We propose a neural network-based few-shot learning method that can learn a topic model from just a few documents. We demonstrate that the proposed method achieves better perplexity than existing methods using three real-world text document sets.
arXiv Detail & Related papers (2021-04-19T01:56:48Z)
Topical Language Generation using Transformers [4.795530213347874]
This paper presents a novel approach for Topical Language Generation (TLG) by combining a pre-trained LM with topic modeling information. We extend our model by introducing new parameters and functions to influence the quantity of the topical features presented in the generated text. Our experimental results demonstrate that our model outperforms the state-of-the-art results on coherency, diversity, and fluency while being faster in decoding.
arXiv Detail & Related papers (2021-03-11T03:45:24Z)
Improving Text Generation with Student-Forcing Optimal Transport [122.11881937642401]
We propose using optimal transport (OT) to match the sequences generated in training and testing modes. An extension is also proposed to improve the OT learning, based on the structural and contextual information of the text sequences. The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
arXiv Detail & Related papers (2020-10-12T19:42:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.