Label-Efficient Model Selection for Text Generation
- URL: http://arxiv.org/abs/2402.07891v3
- Date: Thu, 6 Jun 2024 11:07:17 GMT
- Title: Label-Efficient Model Selection for Text Generation
- Authors: Shir Ashury-Tahan, Ariel Gera, Benjamin Sznajder, Leshem Choshen, Liat Ein-Dor, Eyal Shnarch,
- Abstract summary: We introduce DiffUse, a method to make an informed decision between candidate text generation models based on preference annotations.
In a series of experiments over hundreds of model pairs, we demonstrate that DiffUse can dramatically reduce the required number of annotations.
- Score: 14.61636207880449
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Model selection for a given target task can be costly, as it may entail extensive annotation of the quality of outputs of different models. We introduce DiffUse, an efficient method to make an informed decision between candidate text generation models based on preference annotations. DiffUse reduces the required amount of annotations, thus saving valuable time and resources in performing evaluation. DiffUse intelligently selects instances by clustering embeddings that represent the semantic differences between model outputs. Thus, it is able to identify a subset of examples that are more informative for preference decisions. Our method is model-agnostic, and can be applied to any text generation model for selecting between models, prompts and configurations. Moreover, we propose a practical iterative approach for dynamically determining how many instances to annotate. In a series of experiments over hundreds of model pairs, we demonstrate that DiffUse can dramatically reduce the required number of annotations -- by up to 75% -- while maintaining high evaluation reliability.
Related papers
- Multi-Candidate Speculative Decoding [82.05519287513444]
Large language models have shown impressive capabilities across a variety of NLP tasks, yet their generating text autoregressively is time-consuming.
One way to speed them up is speculative decoding, which generates candidate segments from a fast draft model that is then verified in parallel by the target model.
This paper proposes sampling multiple candidates from a draft model and then organising them in batches for verification.
We design algorithms for efficient multi-candidate verification while maintaining the distribution of the target model.
arXiv Detail & Related papers (2024-01-12T17:15:23Z) - Unveiling the Multi-Annotation Process: Examining the Influence of
Annotation Quantity and Instance Difficulty on Model Performance [1.7343894615131372]
We show how performance scores can vary when a dataset expands from a single annotation per instance to multiple annotations.
We propose a novel multi-annotator simulation process to generate datasets with varying annotation budgets.
arXiv Detail & Related papers (2023-10-23T05:12:41Z) - IDEAL: Influence-Driven Selective Annotations Empower In-Context
Learners in Large Language Models [66.32043210237768]
This paper introduces an influence-driven selective annotation method.
It aims to minimize annotation costs while improving the quality of in-context examples.
Experiments confirm the superiority of the proposed method on various benchmarks.
arXiv Detail & Related papers (2023-10-16T22:53:54Z) - Towards Free Data Selection with General-Purpose Models [71.92151210413374]
A desirable data selection algorithm can efficiently choose the most informative samples to maximize the utility of limited annotation budgets.
Current approaches, represented by active learning methods, typically follow a cumbersome pipeline that iterates the time-consuming model training and batch data selection repeatedly.
FreeSel bypasses the heavy batch selection process, achieving a significant improvement in efficiency and being 530x faster than existing active learning methods.
arXiv Detail & Related papers (2023-09-29T15:50:14Z) - Anchor Points: Benchmarking Models with Much Fewer Examples [88.02417913161356]
In six popular language classification benchmarks, model confidence in the correct class on many pairs of points is strongly correlated across models.
We propose Anchor Point Selection, a technique to select small subsets of datasets that capture model behavior across the entire dataset.
Just several anchor points can be used to estimate model per-class predictions on all other points in a dataset with low mean absolute error.
arXiv Detail & Related papers (2023-09-14T17:45:51Z) - Evaluating Representations with Readout Model Switching [19.907607374144167]
In this paper, we propose to use the Minimum Description Length (MDL) principle to devise an evaluation metric.
We design a hybrid discrete and continuous-valued model space for the readout models and employ a switching strategy to combine their predictions.
The proposed metric can be efficiently computed with an online method and we present results for pre-trained vision encoders of various architectures.
arXiv Detail & Related papers (2023-02-19T14:08:01Z) - Selective Annotation Makes Language Models Better Few-Shot Learners [97.07544941620367]
Large language models can perform in-context learning, where they learn a new task from a few task demonstrations.
This work examines the implications of in-context learning for the creation of datasets for new natural language tasks.
We propose an unsupervised, graph-based selective annotation method, voke-k, to select diverse, representative examples to annotate.
arXiv Detail & Related papers (2022-09-05T14:01:15Z) - An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z) - Multi-label learning for dynamic model type recommendation [13.304462985219237]
We propose a problem-independent dynamic base-classifier model recommendation for the online local pool (OLP) technique.
Our proposed framework builds a multi-label meta-classifier responsible for recommending a set of relevant model types.
Experimental results show that different data distributions favored different model types on a local scope.
arXiv Detail & Related papers (2020-04-01T16:42:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.