Related papers: Efficient and Effective In-context Demonstration Selection with Coreset

URL: http://arxiv.org/abs/2511.08977v1
Date: Thu, 13 Nov 2025 01:23:23 GMT
Title: Efficient and Effective In-context Demonstration Selection with Coreset
Authors: Zihua Wang, Jiarui Wang, Haiyang Xu, Ming Yan, Fei Huang, Xu Yang, Xiu-Shen Wei, Siya Mi, Yu Zhang
Abstract summary: In-context learning (ICL) has emerged as a powerful paradigm for Large Visual Language Models (LVLMs). In this paper, we propose a novel demonstration selection framework named Coreset-based Dual Retrieval (CoDR).
Score: 46.77227297059547
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In-context learning (ICL) has emerged as a powerful paradigm for Large Visual Language Models (LVLMs), enabling them to leverage a few examples directly from input contexts. However, the effectiveness of this approach is heavily reliant on the selection of demonstrations, a process that is NP-hard. Traditional strategies, including random, similarity-based sampling and infoscore-based sampling, often lead to inefficiencies or suboptimal performance, struggling to balance both efficiency and effectiveness in demonstration selection. In this paper, we propose a novel demonstration selection framework named Coreset-based Dual Retrieval (CoDR). We show that samples within a diverse subset achieve a higher expected mutual information. To implement this, we introduce a cluster-pruning method to construct a diverse coreset that aligns more effectively with the query while maintaining diversity. Additionally, we develop a dual retrieval mechanism that enhances the selection process by achieving global demonstration selection while preserving efficiency. Experimental results demonstrate that our method significantly improves the ICL performance compared to the existing strategies, providing a robust solution for effective and efficient demonstration selection.
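
The abstract's idea of building a diverse subset first and then retrieving demonstrations from it can be illustrated with a minimal sketch. This is not the CoDR algorithm: the paper's cluster-pruning and dual retrieval steps are not specified here, so the sketch substitutes greedy farthest-point selection for coreset construction and plain cosine-similarity ranking for retrieval. The function names `diverse_coreset` and `retrieve`, and the toy 2-D embeddings, are hypothetical.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def diverse_coreset(embeddings, size, seed=0):
    """Greedy farthest-point selection (an assumed stand-in for cluster-pruning):
    repeatedly add the point least similar to everything already chosen."""
    rng = random.Random(seed)
    chosen = [rng.randrange(len(embeddings))]
    # best_sim[i] = highest similarity between point i and any chosen point
    best_sim = [cosine(e, embeddings[chosen[0]]) for e in embeddings]
    while len(chosen) < min(size, len(embeddings)):
        candidates = [i for i in range(len(embeddings)) if i not in chosen]
        nxt = min(candidates, key=lambda i: best_sim[i])
        chosen.append(nxt)
        for i, e in enumerate(embeddings):
            best_sim[i] = max(best_sim[i], cosine(e, embeddings[nxt]))
    return chosen


def retrieve(query, embeddings, coreset, k):
    """Return the k coreset indices most similar to the query."""
    ranked = sorted(coreset, key=lambda i: cosine(query, embeddings[i]), reverse=True)
    return ranked[:k]


# Toy pool: two near-duplicates pointing right, two pointing up, one pointing left.
pool = [[1, 0], [0.99, 0.1], [0, 1], [0.1, 0.99], [-1, 0]]
coreset = diverse_coreset(pool, size=3)
demos = retrieve([0, 1], pool, coreset, k=1)
```

Because retrieval only ranks the coreset rather than the whole pool, the per-query cost scales with the coreset size, which is the efficiency argument the abstract makes in general terms.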

Related papers

Towards Compute-Optimal Many-Shot In-Context Learning [69.38428467281862]
We propose two strategies for demonstration selection in many-shot ICL. The first method combines a small number of demonstrations, selected based on similarity to each test sample, with a disproportionately larger set of random demonstrations that are cached. The second strategy improves the first by replacing random demonstrations with those selected using centroids derived from test sample representations via k-means clustering.
arXiv Detail & Related papers (2025-07-22T04:21:03Z)
Large Language Models are Demonstration Pre-Selectors for Themselves [57.101804269100185]
In-context learning (ICL) with large language models (LLMs) delivers strong few-shot performance by choosing few-shot demonstrations from the entire training data. FEw yet Essential Demonstration prE-selectoR is a novel pre-selection framework that identifies a representative subset of demonstrations. FEw yet Essential Demonstration prE-selectoR can reduce training data size by over 20% while maintaining performance.
arXiv Detail & Related papers (2025-06-06T12:29:03Z)
Learning to Select In-Context Demonstration Preferred by Large Language Model [21.077656767563255]
In-context learning (ICL) enables large language models to adapt to new tasks during inference using only a few demonstrations. We propose GenICL, a novel generative preference learning framework that leverages LLM feedback to directly optimize demonstration selection for ICL. Experiments on 19 datasets across 11 task categories demonstrate that GenICL achieves superior performance to existing methods in selecting the most effective demonstrations.
arXiv Detail & Related papers (2025-05-26T13:26:56Z)
Comparative Analysis of Demonstration Selection Algorithms for LLM In-Context Learning [18.58278188791548]
In-context learning can help Large Language Models (LLMs) adapt to new tasks without additional training. Despite the many proposed demonstration selection algorithms, their efficiency and effectiveness remain unclear. This lack of clarity makes it difficult to apply these algorithms in real-world scenarios.
arXiv Detail & Related papers (2024-10-30T15:11:58Z)
Large Language Models Know What Makes Exemplary Contexts [42.90814615222177]
In-context learning (ICL) has proven to be a significant capability with the advancement of Large Language Models (LLMs). This paper presents a unified framework for LLMs that allows them to self-select influential in-context examples to compose their contexts.
arXiv Detail & Related papers (2024-08-14T12:32:41Z)
ParaICL: Towards Parallel In-Context Learning [74.38022919598443]
Large language models (LLMs) have become the norm in natural language processing. Few-shot in-context learning (ICL) relies on the choice of few-shot demonstration examples. We propose a novel method named parallel in-context learning (ParaICL).
arXiv Detail & Related papers (2024-03-31T05:56:15Z)
Revisiting Demonstration Selection Strategies in In-Context Learning [66.11652803887284]
Large language models (LLMs) have shown an impressive ability to perform a wide range of tasks using in-context learning (ICL). In this work, we first revisit the factors contributing to this variance from both data and model aspects, and find that the choice of demonstration is both data- and model-dependent. We propose a data- and model-dependent demonstration selection method, TopK + ConE, based on the assumption that the performance of a demonstration positively correlates with its contribution to the model's understanding of the test samples.
arXiv Detail & Related papers (2024-01-22T16:25:27Z)
In-Context Demonstration Selection with Cross Entropy Difference [95.21947716378641]
Large language models (LLMs) can use in-context demonstrations to improve performance on zero-shot tasks. We present a cross-entropy difference (CED) method for selecting in-context demonstrations.
arXiv Detail & Related papers (2023-05-24T05:04:00Z)
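
The first strategy summarized in the "Towards Compute-Optimal Many-Shot In-Context Learning" entry above (a few similarity-selected demonstrations combined with a larger cached set of random ones) can be sketched roughly as follows. This is an illustration under stated assumptions, not that paper's implementation: the helper `build_prompt_plan`, the choice to place the cached block first so that it forms a shared prefix across queries, and the random toy embeddings are all hypothetical.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def build_prompt_plan(query_vec, pool_vecs, cached_ids, n_similar):
    """Return demonstration ids: the shared cached block first (assumed ordering),
    then the n_similar non-cached pool items most similar to the query."""
    cached = set(cached_ids)
    rest = [i for i in range(len(pool_vecs)) if i not in cached]
    similar = sorted(rest, key=lambda i: cosine(query_vec, pool_vecs[i]), reverse=True)
    return list(cached_ids) + similar[:n_similar]


rng = random.Random(0)
# Toy demonstration pool: 50 random 8-dimensional embeddings.
pool = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(50)]
# Random demonstrations are drawn once and reused for every query.
cached_ids = rng.sample(range(50), 20)

query_a = [rng.gauss(0, 1) for _ in range(8)]
query_b = [rng.gauss(0, 1) for _ in range(8)]
plan_a = build_prompt_plan(query_a, pool, cached_ids, n_similar=4)
plan_b = build_prompt_plan(query_b, pool, cached_ids, n_similar=4)
```

Both plans start with the same 20 ids, so only the short query-specific tail differs from one test sample to the next.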

This list is automatically generated from the titles and abstracts of the papers in this site.
