'One size doesn't fit all': Learning how many Examples to use for In-Context Learning for Improved Text Classification
- URL: http://arxiv.org/abs/2403.06402v1
- Date: Mon, 11 Mar 2024 03:28:13 GMT
- Title: 'One size doesn't fit all': Learning how many Examples to use for In-Context Learning for Improved Text Classification
- Authors: Manish Chandra, Debasis Ganguly, Yiwen Li, Iadh Ounis
- Abstract summary: In-context learning (ICL) uses a small number of labelled data instances as examples in the prompt.
We propose a novel methodology of dynamically adapting the number of examples as per the data.
Our experiments show that our AICL method improves text classification on several standard datasets.
- Score: 18.167541508658417
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Predictive models in natural language processing (NLP) have evolved from
training models from scratch to fine-tuning pre-trained models with labelled
data. An extreme form of this fine-tuning involves in-context learning (ICL),
where the output of a pre-trained generative model (frozen decoder parameters)
is controlled only with variations in the input strings (called instructions or
prompts). An important component of ICL is the use of a small number of
labelled data instances as examples in the prompt. While existing work uses a
static number of examples during inference for each data instance, in this
paper we propose a novel methodology of dynamically adapting the number of
examples as per the data. This is analogous to the use of a variable-sized neighborhood in a k-nearest neighbors (k-NN) classifier. In our proposed workflow of adaptive ICL (AICL), the number of demonstrations to employ during inference on a particular data instance is predicted by the softmax posteriors
of a classifier. The parameters of this classifier are fitted on the optimal
number of examples in ICL required to correctly infer the label of each
instance in the training set, under the hypothesis that a test instance similar to a training instance should use the same (or a closely matching) number of few-shot examples. Our experiments show that our AICL method yields improvements in text classification on several standard datasets.
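A minimal sketch of the AICL workflow described above, assuming embedding-based instance representations and scikit-learn. The helper names, the candidate shot counts in K_CHOICES, the use of logistic regression as the shot-count classifier, and nearest-neighbour demonstration selection are illustrative assumptions, not the authors' implementation.

```python
# Sketch of adaptive ICL (AICL): a classifier's softmax posteriors decide how
# many few-shot demonstrations to include in the prompt for each test instance.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

K_CHOICES = [0, 1, 2, 4, 8]  # candidate numbers of demonstrations (assumption)

def fit_k_predictor(train_vecs, optimal_k):
    """Fit a classifier mapping an instance embedding to the number of ICL
    examples that was found optimal for that training instance.
    `optimal_k` is assumed to contain values drawn from K_CHOICES."""
    y = np.array([K_CHOICES.index(k) for k in optimal_k])
    return LogisticRegression(max_iter=1000).fit(train_vecs, y)

def predict_num_examples(clf, test_vec):
    """Pick k for a test instance from the classifier's softmax posteriors."""
    probs = clf.predict_proba(test_vec.reshape(1, -1))[0]
    return K_CHOICES[int(np.argmax(probs))]

def build_prompt(test_text, train_texts, train_labels, train_vecs, test_vec, k):
    """Assemble a k-shot prompt using the k nearest labelled neighbours."""
    if k == 0:
        return f"Text: {test_text}\nLabel:"
    nn = NearestNeighbors(n_neighbors=k).fit(train_vecs)
    _, idx = nn.kneighbors(test_vec.reshape(1, -1))
    demos = [f"Text: {train_texts[i]}\nLabel: {train_labels[i]}" for i in idx[0]]
    return "\n\n".join(demos + [f"Text: {test_text}\nLabel:"])
```

At inference time one would call predict_num_examples to obtain k, build the prompt with build_prompt, and pass it to the frozen generative model; the prompt template and retrieval strategy shown here are placeholders.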
Related papers
- A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification [51.35500308126506]
Self-supervised learning (SSL) is a machine learning approach where the data itself provides supervision, eliminating the need for external labels.
We study how classification-based evaluation protocols for SSL correlate and how well they predict downstream performance on different dataset types.
arXiv Detail & Related papers (2024-07-16T23:17:36Z) - Prompt Optimization with EASE? Efficient Ordering-aware Automated Selection of Exemplars [66.823588073584]
Large language models (LLMs) have shown impressive capabilities in real-world applications.
The quality of these exemplars in the prompt greatly impacts performance.
Existing methods fail to adequately account for the impact of exemplar ordering on performance.
arXiv Detail & Related papers (2024-05-25T08:23:05Z) - ParaICL: Towards Robust Parallel In-Context Learning [74.38022919598443]
Large language models (LLMs) have become the norm in natural language processing.
Few-shot in-context learning (ICL) relies on the choice of few-shot demonstration examples.
We propose a novel method named parallel in-context learning (ParaICL).
arXiv Detail & Related papers (2024-03-31T05:56:15Z) - Unsupervised Calibration through Prior Adaptation for Text Classification using Large Language Models [37.39843935632105]
We propose an approach to adapt the prior class distribution to perform text classification tasks without the need for labelled samples.
Results show that these methods outperform the un-adapted model for different numbers of training shots in the prompt.
arXiv Detail & Related papers (2023-07-13T12:11:36Z) - Data Curation Alone Can Stabilize In-context Learning [20.874674130060388]
In-context learning (ICL) enables large language models to perform new tasks by prompting them with a sequence of training examples.
However, randomly sampling examples from a training set leads to high variance in performance.
We show that carefully curating a subset of training data greatly stabilizes ICL performance without any other changes to the ICL algorithm.
arXiv Detail & Related papers (2022-12-20T15:58:54Z) - Improving Few-Shot Performance of Language Models via Nearest Neighbor Calibration [12.334422701057674]
We propose a novel nearest-neighbor calibration framework for in-context learning.
It is inspired by the observation that the in-context learning paradigm produces incorrect labels when inferring training instances.
Experiments on various few-shot text classification tasks demonstrate that our method significantly improves in-context learning.
arXiv Detail & Related papers (2022-12-05T12:49:41Z) - Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose a semi-supervised learning (SSL) approach that uses unlabeled examples to train models.
Our proposed approach, Dash, is adaptive in its selection of unlabeled data.
arXiv Detail & Related papers (2021-09-01T23:52:29Z) - An Empirical Comparison of Instance Attribution Methods for NLP [62.63504976810927]
We evaluate the degree to which different instance attribution methods agree on the importance of training samples.
We find that simple retrieval methods yield training instances that differ from those identified via gradient-based methods.
arXiv Detail & Related papers (2021-04-09T01:03:17Z) - Few-shot learning through contextual data augmentation [74.20290390065475]
Machine translation models need to adapt to new data to maintain their performance over time.
We show that adaptation on the scale of one to five examples is possible.
Our model reports better accuracy scores than a reference system trained with, on average, 313 parallel examples.
arXiv Detail & Related papers (2021-03-31T09:05:43Z)