Related papers: True Few-Shot Learning with Language Models

URL: http://arxiv.org/abs/2105.11447v1
Date: Mon, 24 May 2021 17:55:51 GMT
Title: True Few-Shot Learning with Language Models
Authors: Ethan Perez, Douwe Kiela, Kyunghyun Cho
Abstract summary: We evaluate the few-shot ability of LMs when held-out examples are unavailable. Our findings suggest that prior work significantly overestimated the true few-shot ability of LMs.
Score: 78.42578316883271
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Pretrained language models (LMs) perform well on many tasks even when learning from a few examples, but prior work uses many held-out examples to tune various aspects of learning, such as hyperparameters, training objectives, and natural language templates ("prompts"). Here, we evaluate the few-shot ability of LMs when such held-out examples are unavailable, a setting we call true few-shot learning. We test two model selection criteria, cross-validation and minimum description length, for choosing LM prompts and hyperparameters in the true few-shot setting. On average, both marginally outperform random selection and greatly underperform selection based on held-out examples. Moreover, selection criteria often prefer models that perform significantly worse than randomly-selected ones. We find similar results even when taking into account our uncertainty in a model's true performance during selection, as well as when varying the amount of computation and number of examples used for selection. Overall, our findings suggest that prior work significantly overestimated the true few-shot ability of LMs given the difficulty of few-shot model selection.
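
The cross-validation criterion mentioned in the abstract can be illustrated with a minimal sketch. It assumes leave-one-out folds and a caller-supplied loss function; the abstract does not specify either, and the names `loo_cv_score`, `select_prompt`, and `toy_loss` are hypothetical and not taken from the paper. The toy loss only stands in for the negative log-likelihood that a real LM would assign to the held-out label. Minimum description length is not shown.

```python
from typing import Callable, List, Sequence, Tuple

Example = Tuple[str, str]  # (input text, label)
LossFn = Callable[[str, List[Example], Example], float]


def loo_cv_score(prompt: str, examples: Sequence[Example], loss_fn: LossFn) -> float:
    """Average loss on each example when it is held out of the few-shot context."""
    total = 0.0
    for i, held_out in enumerate(examples):
        # Every example except the held-out one is used as context.
        context = list(examples[:i]) + list(examples[i + 1:])
        total += loss_fn(prompt, context, held_out)
    return total / len(examples)


def select_prompt(prompts: Sequence[str], examples: Sequence[Example], loss_fn: LossFn) -> str:
    """Pick the prompt with the lowest leave-one-out loss, using only the few-shot examples."""
    return min(prompts, key=lambda p: loo_cv_score(p, examples, loss_fn))


def toy_loss(prompt: str, context: List[Example], held_out: Example) -> float:
    """Hypothetical stand-in for an LM's loss on the held-out label (assumption, not a real model)."""
    # Lower loss when more context examples share the held-out label.
    matches = sum(1 for _, label in context if label == held_out[1])
    base = 1.0 / (1 + matches)
    # Penalize prompts that have no slot for the input text.
    penalty = 0.0 if "{input}" in prompt else 1.0
    return base + penalty


EXAMPLES: List[Example] = [
    ("great movie", "pos"),
    ("awful", "neg"),
    ("loved it", "pos"),
    ("boring", "neg"),
]
PROMPTS = ["Review: {input} Sentiment:", "Classify this."]

best = select_prompt(PROMPTS, EXAMPLES, toy_loss)
print(best)
```

No held-out validation set appears anywhere in the sketch: the same few examples serve as both context and selection data, which is the constraint the true few-shot setting imposes.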

Related papers

Diversified Batch Selection for Training Acceleration [68.67164304377732]
A prevalent research line, known as online batch selection, explores selecting informative subsets during the training process. Vanilla reference-model-free methods involve independently scoring and selecting data in a sample-wise manner. We propose Diversified Batch Selection (DivBS), which is reference-model-free and can efficiently select diverse and representative samples.
arXiv Detail & Related papers (2024-06-07T12:12:20Z)
What Makes Good Few-shot Examples for Vision-Language Models? [29.620987070958318]
We introduce two innovative selection methods: Representativeness (REPRE) and Gaussian Monte Carlo (Montecarlo). Our findings demonstrate that both REPRE and Montecarlo significantly surpass both random selection and AL-based strategies in few-shot training scenarios. The research also underscores that these instance selection methods are model-agnostic, offering a versatile enhancement to a wide array of few-shot training methodologies.
arXiv Detail & Related papers (2024-05-22T11:03:33Z)
Experimental Design for Active Transductive Inference in Large Language Models [18.2671641610825]
We use active learning for adaptive prompt design and call it Active In-context Prompt Design (AIPD). We design the LLM prompt by adaptively choosing few-shot examples from a training set to optimize performance on a test set. We propose two algorithms, GO and SAL, which differ in how the few-shot examples are chosen.
arXiv Detail & Related papers (2024-04-12T23:27:46Z)
In-Context Learning with Iterative Demonstration Selection [32.62104857810135]
Large language models (LLMs) have demonstrated strong few-shot learning ability via in-context learning (ICL). The performance of ICL has been shown to be highly sensitive to the selection of few-shot demonstrations. We propose Iterative Demonstration Selection (IDS) to leverage the merits of both dimensions.
arXiv Detail & Related papers (2023-10-15T16:40:19Z)
Large Language Models Are Not Robust Multiple Choice Selectors [117.72712117510953]
Multiple choice questions (MCQs) serve as a common yet important task format in the evaluation of large language models (LLMs). This work shows that modern LLMs are vulnerable to option position changes due to their inherent "selection bias". We propose a label-free, inference-time debiasing method, called PriDe, which separates the model's prior bias for option IDs from the overall prediction distribution.
arXiv Detail & Related papers (2023-09-07T17:44:56Z)
RetICL: Sequential Retrieval of In-Context Examples with Reinforcement Learning [53.52699766206808]
We propose Retrieval for In-Context Learning (RetICL), a learnable method for modeling and optimally selecting examples sequentially for in-context learning. We evaluate RetICL on math word problem solving and scientific question answering tasks and show that it consistently outperforms or matches heuristic and learnable baselines.
arXiv Detail & Related papers (2023-05-23T20:15:56Z)
Active Learning Principles for In-Context Learning with Large Language Models [65.09970281795769]
This paper investigates how Active Learning algorithms can serve as effective demonstration selection methods for in-context learning. We show that in-context example selection through AL prioritizes high-quality examples that exhibit low uncertainty and bear similarity to the test examples.
arXiv Detail & Related papers (2023-05-23T17:16:04Z)
Skill-Based Few-Shot Selection for In-Context Learning [123.26522773708683]
Skill-KNN is a skill-based few-shot selection method for in-context learning. It does not require training or fine-tuning of any models, making it suitable for frequently expanding or changing example banks. Experimental results across five cross-domain semantic parsing datasets and six backbone models show that Skill-KNN significantly outperforms existing methods.
arXiv Detail & Related papers (2023-05-23T16:28:29Z)
