Revisiting In-Context Learning with Long Context Language Models
- URL: http://arxiv.org/abs/2412.16926v2
- Date: Mon, 06 Jan 2025 08:45:06 GMT
- Title: Revisiting In-Context Learning with Long Context Language Models
- Authors: Jinheon Baek, Sun Jae Lee, Prakhar Gupta, Geunseob Oh, Siddharth Dalmia, Prateek Kolhar
- Abstract summary: In-Context Learning (ICL) is a technique by which language models make predictions based on examples provided in their input context.
The advent of Long Context Language Models (LCLMs) has significantly increased the number of examples that can be included in context.
We revisit example selection approaches in the context of LCLMs through extensive experiments on 18 datasets spanning 4 tasks.
- Abstract: In-Context Learning (ICL) is a technique by which language models make predictions based on examples provided in their input context. Previously, the context window size imposed a limit on the number of examples that could be shown, making example selection techniques crucial for identifying the maximally effective set of examples. However, the recent advent of Long Context Language Models (LCLMs) has significantly increased the number of examples that can be included in context, raising the important question of whether ICL performance in the many-shot regime is still sensitive to the method of sample selection. To answer this, we revisit these approaches in the context of LCLMs through extensive experiments on 18 datasets spanning 4 tasks. Surprisingly, we observe that sophisticated example selection techniques do not yield significant improvements over a simple random sample selection method. Instead, we find that the advent of LCLMs has fundamentally shifted the challenge of ICL from selecting the most effective examples to collecting sufficient examples to fill the context window. Specifically, on certain datasets, including all available examples does not fully utilize the context window; however, by augmenting the examples in context with a simple data augmentation approach, we substantially improve ICL performance by 5%.
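The recipe the abstract describes lends itself to a short sketch: sample examples at random until a token budget is reached, then top up with augmented examples when the pool runs dry. Below is a minimal Python illustration, assuming string-typed examples and two hypothetical hooks, `count_tokens` for budgeting and `augment` for synthesizing extra examples; the paper's actual augmentation procedure is not specified in the abstract above.

```python
import random

def build_many_shot_prompt(pool, query, token_budget, count_tokens, augment=None):
    """Pack a long-context window with randomly ordered ICL examples.

    Sketch only: `count_tokens` maps a string to a token count and
    `augment` synthesizes one new example from the chosen ones; both
    are hypothetical hooks, not APIs from the paper.
    """
    examples = list(pool)
    random.shuffle(examples)               # random selection, no curation
    chosen, used = [], count_tokens(query)
    for ex in examples:                    # greedily fill the budget
        cost = count_tokens(ex)
        if used + cost > token_budget:
            break
        chosen.append(ex)
        used += cost
    # If the pool alone cannot fill the window, top it up with augmented
    # examples (the abstract reports ~5% gains from doing so).
    while augment is not None and used < token_budget:
        ex = augment(chosen)
        cost = count_tokens(ex)
        if cost <= 0 or used + cost > token_budget:
            break
        chosen.append(ex)
        used += cost
    return "\n\n".join(chosen + [query])
```

A toy tokenizer such as `lambda s: len(s.split())` is enough to exercise the packing logic.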
Related papers
- PromptRefine: Enhancing Few-Shot Performance on Low-Resource Indic Languages with Example Selection from Related Example Banks [57.86928556668849]
Large Language Models (LLMs) have recently demonstrated impressive few-shot learning capabilities through in-context learning (ICL).
ICL performance is highly dependent on the choice of few-shot demonstrations, making the selection of optimal examples a persistent research challenge.
In this work, we propose PromptRefine, a novel Alternating Minimization approach for example selection that improves ICL performance on low-resource Indic languages.
arXiv Detail & Related papers (2024-12-07T17:51:31Z)
- Large Language Models Know What Makes Exemplary Contexts [42.90814615222177]
In-context learning (ICL) has proven to be a significant capability with the advancement of Large Language Models (LLMs).
This paper presents a unified framework for LLMs that allows them to self-select influential in-context examples to compose their contexts.
arXiv Detail & Related papers (2024-08-14T12:32:41Z)
- Can Few-shot Work in Long-Context? Recycling the Context to Generate Demonstrations [44.24067814871803]
In-Context Learning (ICL) with few-shot examples may be an appealing solution to enhance LLM performance in long contexts.
We propose to automatically generate few-shot examples for long context QA tasks by recycling contexts.
arXiv Detail & Related papers (2024-06-19T15:28:29Z)
- Prompt Optimization with EASE? Efficient Ordering-aware Automated Selection of Exemplars [66.823588073584]
Large language models (LLMs) have shown impressive capabilities in real-world applications.
The quality of the exemplars included in the prompt greatly impacts performance.
Existing methods fail to adequately account for the impact of exemplar ordering on performance.
arXiv Detail & Related papers (2024-05-25T08:23:05Z)
- ParaICL: Towards Robust Parallel In-Context Learning [74.38022919598443]
Large language models (LLMs) have become the norm in natural language processing.
Few-shot in-context learning (ICL) relies on the choice of few-shot demonstration examples.
We propose a novel method named parallel in-context learning (ParaICL).
arXiv Detail & Related papers (2024-03-31T05:56:15Z)
- One size doesn't fit all: Predicting the Number of Examples for In-Context Learning [16.712595387955574]
In-context learning (ICL) refers to the process of adding a small number of localized examples from a training set of labelled data to an LLM's prompt.
Our work alleviates the limitations of this 'one fits all' approach by dynamically predicting, for each data instance, the number of examples to be used in few-shot inference.
Our experiments on a number of text classification benchmarks show that this adaptive ICL approach (AICL) substantially outperforms standard ICL by up to 17%.
arXiv Detail & Related papers (2024-03-11T03:28:13Z)
- RetICL: Sequential Retrieval of In-Context Examples with Reinforcement Learning [53.52699766206808]
We propose Retrieval for In-Context Learning (RetICL), a learnable method for modeling and optimally selecting examples sequentially for in-context learning.
We evaluate RetICL on math word problem solving and scientific question answering tasks and show that it consistently outperforms or matches heuristic and learnable baselines (a simplified sketch of sequential selection follows this entry).
arXiv Detail & Related papers (2023-05-23T20:15:56Z)
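RetICL's key idea, per the summary above, is to choose examples sequentially, conditioning each pick on what has already been selected. The Python sketch below only illustrates that sequential framing with a greedy loop; the actual method learns its policy with reinforcement learning, and `score_next` here is a hypothetical scoring function, not the paper's model.

```python
def sequential_select(pool, query, score_next, shots=4):
    """Greedy stand-in for sequential in-context example selection.

    `score_next(query, chosen, candidate)` is a hypothetical function
    rating how well `candidate` extends the already-chosen sequence;
    RetICL instead learns this policy with reinforcement learning.
    """
    chosen, remaining = [], list(pool)
    for _ in range(min(shots, len(remaining))):
        # Condition the next pick on everything selected so far.
        best = max(remaining, key=lambda ex: score_next(query, chosen, ex))
        chosen.append(best)
        remaining.remove(best)
    return chosen
```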
- Active Learning Principles for In-Context Learning with Large Language Models [65.09970281795769]
This paper investigates how Active Learning algorithms can serve as effective demonstration selection methods for in-context learning.
We show that in-context example selection through AL prioritizes high-quality examples that exhibit low uncertainty and bear similarity to the test examples.
arXiv Detail & Related papers (2023-05-23T17:16:04Z)
- Finding Support Examples for In-Context Learning [73.90376920653507]
We propose LENS, a fiLter-thEN-Search method that finds support examples in two stages.
First, we filter the dataset to identify individually informative in-context examples.
Then, we apply a diversity-guided example search that iteratively refines and evaluates the selected example permutations (a rough sketch follows this entry).
arXiv Detail & Related papers (2023-02-27T06:32:45Z)
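As a rough illustration of the two-stage structure in the LENS entry above (and only the structure; the paper's actual informativeness metric and diversity-guided search are more involved), here is a Python sketch with two hypothetical scoring functions, `info_score` for single examples and `set_score` for ordered sequences.

```python
import random

def filter_then_search(pool, info_score, set_score, k=16, shots=4, iters=100):
    """Two-stage selector: per-example filter, then search over orderings.

    `info_score(ex)` and `set_score(seq)` are hypothetical stand-ins
    for LENS's metrics; the random restarts below stand in for its
    diversity-guided search.
    """
    # Stage 1: keep the k individually most informative examples.
    shortlist = sorted(pool, key=info_score, reverse=True)[:k]
    # Stage 2: iteratively evaluate candidate example permutations
    # (order matters to the downstream model).
    best, best_val = None, float("-inf")
    for _ in range(iters):
        cand = random.sample(shortlist, min(shots, len(shortlist)))
        val = set_score(cand)
        if val > best_val:
            best, best_val = cand, val
    return best
```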
- Compositional Exemplars for In-context Learning [21.961094715261133]
Large pretrained language models (LMs) have shown impressive In-Context Learning (ICL) ability.
We propose CEIL (Compositional Exemplars for In-context Learning) to model the interaction between the given input and in-context examples.
We validate CEIL on 12 classification and generation datasets from 7 distinct NLP tasks, including sentiment analysis, paraphrase detection, natural language inference, commonsense reasoning, open-domain question answering, code generation, and semantic parsing.
arXiv Detail & Related papers (2023-02-11T14:02:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.