Refract ICL: Rethinking Example Selection in the Era of Million-Token Models
- URL: http://arxiv.org/abs/2506.12346v1
- Date: Sat, 14 Jun 2025 04:51:34 GMT
- Title: Refract ICL: Rethinking Example Selection in the Era of Million-Token Models
- Authors: Arjun R. Akula, Kazuma Hashimoto, Krishna Srinivasan, Aditi Chaudhary, Karthik Raman, Michael Bendersky
- Abstract summary: Long-context large language models (LLMs) have enabled the use of hundreds, or even thousands, of demonstrations for in-context learning (ICL). This paper investigates whether traditional ICL selection strategies, which balance the similarity of ICL examples to the test input with diversity within the ICL set, remain effective when utilizing a large number of demonstrations. We introduce Refract ICL, a novel ICL selection algorithm specifically designed to focus LLM attention on challenging examples.
- Score: 31.1838001692089
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The emergence of long-context large language models (LLMs) has enabled the use of hundreds, or even thousands, of demonstrations for in-context learning (ICL), a previously impractical regime. This paper investigates whether traditional ICL selection strategies, which balance the similarity of ICL examples to the test input (using a text retriever) with diversity within the ICL set, remain effective when utilizing a large number of demonstrations. Our experiments demonstrate that, while longer contexts can accommodate more examples, simply increasing the number of demonstrations does not guarantee improved performance. Smart ICL selection remains crucial, even with thousands of demonstrations. To further enhance ICL in this setting, we introduce Refract ICL, a novel ICL selection algorithm specifically designed to focus LLM attention on challenging examples by strategically repeating them within the context and incorporating zero-shot predictions as error signals. Our results show that Refract ICL significantly improves the performance of extremely long-context models such as Gemini 1.5 Pro, particularly on tasks with a smaller number of output classes.
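The idea described in the abstract (repeating challenging examples and attaching zero-shot predictions as error signals) could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Demo` type, the `build_refract_prompt` function, the `zero_shot` callable, the `repeats` parameter, the prompt layout, and the rule that a demonstration counts as "challenging" when its zero-shot prediction disagrees with its gold label are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Demo:
    """One labeled demonstration (hypothetical container for this sketch)."""
    text: str
    label: str


def build_refract_prompt(
    demos: List[Demo],
    zero_shot: Callable[[str], str],
    test_input: str,
    repeats: int = 1,
) -> str:
    """Build a many-shot prompt that emphasizes challenging demonstrations.

    Assumption: a demonstration is "challenging" when the model's
    zero-shot prediction differs from its gold label.
    """
    blocks: List[str] = []
    hard_blocks: List[str] = []
    for demo in demos:
        prediction = zero_shot(demo.text)
        # Show the zero-shot prediction next to the gold label so the
        # mismatch acts as an explicit error signal in the context.
        block = (
            f"Input: {demo.text}\n"
            f"Zero-shot prediction: {prediction}\n"
            f"Label: {demo.label}"
        )
        blocks.append(block)
        if prediction != demo.label:
            hard_blocks.append(block)

    # Repeat the challenging demonstrations after the full set.
    blocks.extend(hard_blocks * repeats)
    # The test input comes last, with the label left open for the model.
    blocks.append(f"Input: {test_input}\nLabel:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    demos = [Demo("great movie", "pos"), Demo("awful plot", "neg")]
    # Toy stand-in for a real model call: always predicts "pos".
    print(build_refract_prompt(demos, lambda text: "pos", "fine acting"))
```

In practice `zero_shot` would wrap a call to the long-context model itself, and the demonstrations passed in would already have been chosen by a similarity- and diversity-aware selector.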
Related papers
- Large Language Models are Demonstration Pre-Selectors for Themselves [57.101804269100185]
In-context learning (ICL) with large language models (LLMs) delivers strong few-shot performance by choosing few-shot demonstrations from the entire training data. FEw yet Essential Demonstration prE-selectoR is a novel pre-selection framework that identifies a representative subset of demonstrations. FEw yet Essential Demonstration prE-selectoR can reduce training data size by over 20% while maintaining performance.
Date: 2025-06-06T12:29:03Z
- Implicit In-context Learning [37.0562059811099]
We introduce Implicit In-context Learning (I2CL), an innovative paradigm that reduces the inference cost of ICL to that of zero-shot learning with minimal information loss. I2CL achieves few-shot level performance at zero-shot inference cost, and it exhibits robustness against variations in demonstration examples.
Date: 2024-05-23T14:57:52Z
- In-Context Learning with Long-Context Models: An In-Depth Exploration [92.16922648612807]
We show that, for many datasets with large label spaces, performance continues to increase with thousands of demonstrations. We show that long-context ICL can be an effective tool, and may not require long-context for encoding the demonstration set at all.
Date: 2024-04-30T21:06:52Z
- Many-Shot In-Context Learning [58.395589302800566]
Large language models (LLMs) excel at few-shot in-context learning (ICL). We observe significant performance gains across a wide variety of generative and discriminative tasks. Unlike few-shot learning, many-shot learning is effective at overriding pretraining biases.
Date: 2024-04-17T02:49:26Z
- ParaICL: Towards Parallel In-Context Learning [74.38022919598443]
Large language models (LLMs) have become the norm in natural language processing. Few-shot in-context learning (ICL) relies on the choice of few-shot demonstration examples. We propose a novel method named parallel in-context learning (ParaICL).
Date: 2024-03-31T05:56:15Z
- Dynamic Demonstrations Controller for In-Context Learning [48.455265597575675]
In-context learning (ICL) is a new paradigm for natural language processing (NLP). It is commonly believed that the number of demonstrations is positively correlated with model performance. We propose a Dynamic Demonstrations Controller (D$^2$Controller) which can improve the ICL performance by adjusting the number of demonstrations.
Date: 2023-09-30T14:04:22Z
- Iterative Forward Tuning Boosts In-Context Learning in Language Models [88.25013390669845]
In this study, we introduce a novel two-stage framework to boost in-context learning in large language models (LLMs). Specifically, our framework delineates the ICL process into two distinct stages: Deep-Thinking and test stages. The Deep-Thinking stage incorporates a unique attention mechanism, i.e., iterative enhanced attention, which enables multiple rounds of information accumulation.
Date: 2023-05-22T13:18:17Z