Sample Efficient Demonstration Selection for In-Context Learning
- URL: http://arxiv.org/abs/2506.08607v1
- Date: Tue, 10 Jun 2025 09:16:11 GMT
- Title: Sample Efficient Demonstration Selection for In-Context Learning
- Authors: Kiran Purohit, V Venktesh, Sourangshu Bhattacharya, Avishek Anand,
- Abstract summary: In this paper, we formulate the exemplar selection task as a top-m best arms identification problem.<n>A key challenge in this setup is the exponentially large number of arms that need to be evaluated to identify the m-best arms.<n>We propose CASE, a novel sample-efficient selective exploration strategy that maintains a shortlist of "challenger" arms.
- Score: 4.637359990120997
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The in-context learning paradigm with LLMs has been instrumental in advancing a wide range of natural language processing tasks. The selection of few-shot examples (exemplars / demonstration samples) is essential for constructing effective prompts under context-length budget constraints. In this paper, we formulate the exemplar selection task as a top-m best arms identification problem. A key challenge in this setup is the exponentially large number of arms that need to be evaluated to identify the m-best arms. We propose CASE (Challenger Arm Sampling for Exemplar selection), a novel sample-efficient selective exploration strategy that maintains a shortlist of "challenger" arms, which are current candidates for the top-m arms. In each iteration, only one of the arms from this shortlist or the current topm set is pulled, thereby reducing sample complexity and, consequently, the number of LLM evaluations. Furthermore, we model the scores of exemplar subsets (arms) using a parameterized linear scoring function, leading to stochastic linear bandits setting. CASE achieves remarkable efficiency gains of up to 7x speedup in runtime while requiring 7x fewer LLM calls (87% reduction) without sacrificing performance compared to state-of-the-art exemplar selection methods. We release our code and data at https://github.com/kiranpurohit/CASE
Related papers
- Learning to Reason Across Parallel Samples for LLM Reasoning [45.60752271688715]
Scaling test-time compute brings substantial performance gains for large language models.<n>We propose a new way to leverage such multiple sample set.<n>We train a compact LLM, that takes a sequence of multiple samples and output the final answer.
arXiv Detail & Related papers (2025-06-10T17:42:35Z) - Large Language Models are Demonstration Pre-Selectors for Themselves [57.101804269100185]
In-context learning (ICL) with large language models (LLMs) delivers strong few-shot performance by choosing few-shot demonstrations from the entire training data.<n>FEw yet Essential Demonstration prE-selectoR is a novel pre-selection framework that identifies a representative subset of demonstrations.<n>FEw yet Essential Demonstration prE-selectoR can reduce training data size by over 20% while maintaining performance.
arXiv Detail & Related papers (2025-06-06T12:29:03Z) - Sample, Don't Search: Rethinking Test-Time Alignment for Language Models [55.2480439325792]
We introduce QAlign, a new test-time alignment approach.<n>As we scale test-time compute, QAlign converges to sampling from the optimal aligned distribution for each individual prompt.<n>By adopting recent advances in Markov chain Monte Carlo for text generation, our method enables better-aligned outputs without modifying the underlying model or even requiring logit access.
arXiv Detail & Related papers (2025-04-04T00:41:40Z) - EXPLORA: Efficient Exemplar Subset Selection for Complex Reasoning [5.172620636569522]
Large language models (LLMs) have enabled in-context learning (ICL), allowing LLMs to acquire proficiency in a specific task using only a few demonstration samples (exemplars)
A critical challenge in ICL is the selection of optimal exemplars, which can be either task-specific (static) or test-example-specific (dynamic)
arXiv Detail & Related papers (2024-11-06T12:48:04Z) - On Speeding Up Language Model Evaluation [48.51924035873411]
We propose an $textitadaptive$ approach to explore this space.<n>We lean on multi-armed bandits to sequentially identify the next (method, validation sample)-pair to evaluate.<n>We show that it can identify the top-performing method using only 5-15% of the typical resources.
arXiv Detail & Related papers (2024-07-08T17:48:42Z) - Prompt Optimization with EASE? Efficient Ordering-aware Automated Selection of Exemplars [66.823588073584]
Large language models (LLMs) have shown impressive capabilities in real-world applications.
The quality of these exemplars in the prompt greatly impacts performance.
Existing methods fail to adequately account for the impact of exemplar ordering on the performance.
arXiv Detail & Related papers (2024-05-25T08:23:05Z) - Experimental Design for Active Transductive Inference in Large Language Models [18.2671641610825]
We use active learning for adaptive prompt design and call it Active In-context Prompt Design (AIPD)
We design the LLM prompt by adaptively choosing few-shot examples from a training set to optimize performance on a test set.
We propose two algorithms, GO and SAL, which differ in how the few-shot examples are chosen.
arXiv Detail & Related papers (2024-04-12T23:27:46Z) - RetICL: Sequential Retrieval of In-Context Examples with Reinforcement Learning [53.52699766206808]
We propose Retrieval for In-Context Learning (RetICL), a learnable method for modeling and optimally selecting examples sequentially for in-context learning.
We evaluate RetICL on math word problem solving and scientific question answering tasks and show that it consistently outperforms or matches and learnable baselines.
arXiv Detail & Related papers (2023-05-23T20:15:56Z) - Max-Utility Based Arm Selection Strategy For Sequential Query
Recommendations [16.986870945319293]
We consider the query recommendation problem in closed loop interactive learning settings like online information gathering and exploratory analytics.
The problem can be naturally modelled using the Multi-Armed Bandits (MAB) framework with countably many arms.
We show that such a selection strategy often results in higher cumulative regret and to this end, we propose a selection strategy based on the maximum utility of the arms.
arXiv Detail & Related papers (2021-08-31T13:03:30Z) - True Few-Shot Learning with Language Models [78.42578316883271]
We evaluate the few-shot ability of LMs when held-out examples are unavailable.
Our findings suggest that prior work significantly overestimated the true few-shot ability of LMs.
arXiv Detail & Related papers (2021-05-24T17:55:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.