Learn to Select: Exploring Label Distribution Divergence for In-Context Demonstration Selection in Text Classification
- URL: http://arxiv.org/abs/2511.10675v1
- Date: Mon, 10 Nov 2025 08:04:14 GMT
- Title: Learn to Select: Exploring Label Distribution Divergence for In-Context Demonstration Selection in Text Classification
- Authors: Ye Jiang, Taihang Wang, Youzheng Liu, Yimin Wang, Yuhan Xia, Yunfei Long
- Abstract summary: In-context learning (ICL) for text classification has demonstrated impressive performance on large language models (LLMs). We propose a two-stage demonstration selection method, TopK + Label Distribution Divergence (L2D). This enables the selection of demonstrations that are not only semantically similar but also aligned in label distribution with the test input.
- Score: 9.105555204653275
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In-context learning (ICL) for text classification, which uses a few input-label demonstrations to describe a task, has demonstrated impressive performance on large language models (LLMs). However, the selection of in-context demonstrations plays a crucial role and can significantly affect LLMs' performance. Most existing demonstration selection methods primarily focus on semantic similarity between test inputs and demonstrations, often overlooking the importance of label distribution alignment. To address this limitation, we propose a two-stage demonstration selection method, TopK + Label Distribution Divergence (L2D), which leverages a fine-tuned BERT-like small language model (SLM) to generate label distributions and calculate their divergence for both test inputs and candidate demonstrations. This enables the selection of demonstrations that are not only semantically similar but also aligned in label distribution with the test input. Extensive experiments across seven text classification benchmarks show that our method consistently outperforms previous demonstration selection strategies. Further analysis reveals a positive correlation between the performance of LLMs and the accuracy of the underlying SLMs used for label distribution estimation.
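The two-stage selection described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function and variable names, the shot counts `k` and `m`, and the use of KL divergence as the distribution-divergence measure are all assumptions (the abstract only says the SLM's label distributions are compared by "divergence").

```python
import numpy as np

def l2d_select(test_emb, cand_embs, test_label_dist, cand_label_dists, k=20, m=4):
    """Two-stage demonstration selection (sketch).

    Stage 1: retrieve the top-k candidates most semantically similar
             to the test input (cosine similarity over embeddings).
    Stage 2: rerank those k candidates by the divergence between the
             SLM's label distribution for the test input and for each
             candidate, keeping the m closest.
    """
    # Stage 1: TopK semantic retrieval by cosine similarity.
    sims = cand_embs @ test_emb / (
        np.linalg.norm(cand_embs, axis=1) * np.linalg.norm(test_emb) + 1e-12
    )
    topk = np.argsort(-sims)[:k]

    # Stage 2: rerank by KL divergence between label distributions
    # (KL is an assumed choice; any divergence measure fits this slot).
    p = test_label_dist
    kls = [
        float(np.sum(p * np.log((p + 1e-12) / (cand_label_dists[i] + 1e-12))))
        for i in topk
    ]
    order = np.argsort(kls)
    return [int(topk[i]) for i in order[:m]]
```

In practice the embeddings would come from a retriever and the label distributions from the fine-tuned BERT-like SLM's softmax outputs; both are passed in as plain arrays here to keep the sketch self-contained.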
Related papers
- Rethinking Label Consistency of In-Context Learning: An Implicit Transductive Label Propagation Perspective [34.36815585602357]
Large language models (LLMs) perform in-context learning (ICL) with minimal supervised examples. Current approaches typically employ retrieval models to select the top-K most semantically similar examples as demonstrations. We propose a data synthesis method, leveraging both semantic and label information, and use TopK sampling with Synthetic Data (TopK-SD) to acquire demonstrations with consistent labels.
arXiv Detail & Related papers (2025-12-13T04:41:31Z)
- On the Relationship Between the Choice of Representation and In-Context Learning [38.52385081212209]
In-context learning (ICL) is the ability of a large language model to learn a new task from a few demonstrations presented as part of the context. Past studies have attributed a large portion of the success of ICL to the way these in-context demonstrations are represented. We study the interaction between these two aspects of ICL: representation and learning.
arXiv Detail & Related papers (2025-10-09T15:55:28Z)
- Towards Compute-Optimal Many-Shot In-Context Learning [69.38428467281862]
We propose two strategies for demonstration selection in many-shot ICL. The first method combines a small number of demonstrations, selected based on similarity to each test sample, with a disproportionately larger set of random demonstrations that are cached. The second strategy improves on the first by replacing the random demonstrations with those selected using centroids derived from test-sample representations via k-means clustering.
arXiv Detail & Related papers (2025-07-22T04:21:03Z)
- PICLe: Pseudo-Annotations for In-Context Learning in Low-Resource Named Entity Detection [56.916656013563355]
In-context learning (ICL) enables large language models to perform tasks using few demonstrations. We propose PICLe, a framework for in-context learning with noisy, pseudo-annotated demonstrations. We evaluate PICLe on five biomedical NED datasets and show that, with zero human annotation, PICLe outperforms ICL in low-resource settings.
arXiv Detail & Related papers (2024-12-16T16:09:35Z)
- Logit Separability-Driven Samples and Multiple Class-Related Words Selection for Advancing In-Context Learning [0.0]
We introduce logit separability, a criterion to assess the clarity of both samples and class-related words at the logit level.
We find that incorporating multiple class-related words for each sample, rather than relying on a single class name, improves performance by offering a broader range of label information.
We propose LICL, a logit separability-based method that jointly organizes samples and integrates multiple class-related words into each sample-label pair.
arXiv Detail & Related papers (2024-06-16T12:11:46Z)
- ParaICL: Towards Parallel In-Context Learning [74.38022919598443]
Large language models (LLMs) have become the norm in natural language processing. Few-shot in-context learning (ICL) relies on the choice of few-shot demonstration examples. We propose a novel method named parallel in-context learning (ParaICL).
arXiv Detail & Related papers (2024-03-31T05:56:15Z)
- Revisiting Demonstration Selection Strategies in In-Context Learning [66.11652803887284]
Large language models (LLMs) have shown an impressive ability to perform a wide range of tasks using in-context learning (ICL).
In this work, we first revisit the factors contributing to this variance from both data and model aspects, and find that the choice of demonstration is both data- and model-dependent.
We propose a data- and model-dependent demonstration selection method, TopK + ConE, based on the assumption that the performance of a demonstration positively correlates with its contribution to the model's understanding of the test samples.
arXiv Detail & Related papers (2024-01-22T16:25:27Z)
- In-Context Learning with Iterative Demonstration Selection [32.62104857810135]
Large language models (LLMs) have demonstrated strong few-shot learning ability via in-context learning (ICL). The performance of ICL has been shown to be highly sensitive to the selection of few-shot demonstrations. We propose Iterative Demonstration Selection (IDS) to leverage the merits of both dimensions.
arXiv Detail & Related papers (2023-10-15T16:40:19Z)
- Ambiguity-Aware In-Context Learning with Large Language Models [27.20414960164616]
In-context learning (ICL), i.e., showing LLMs task-specific demonstrations, has led to downstream gains with no task-specific fine-tuning required.
This study investigates how to select good demonstrations for ICL.
We find that it is beneficial to not only choose semantically similar ICL demonstrations but also to choose those that help resolve the inherent label ambiguity surrounding the test example.
arXiv Detail & Related papers (2023-09-14T17:48:34Z)
- In-Context Demonstration Selection with Cross Entropy Difference [95.21947716378641]
Large language models (LLMs) can use in-context demonstrations to improve performance on zero-shot tasks.
We present a cross-entropy difference (CED) method for selecting in-context demonstrations.
arXiv Detail & Related papers (2023-05-24T05:04:00Z)
- Active Learning Principles for In-Context Learning with Large Language Models [65.09970281795769]
This paper investigates how Active Learning algorithms can serve as effective demonstration selection methods for in-context learning.
We show that in-context example selection through AL prioritizes high-quality examples that exhibit low uncertainty and bear similarity to the test examples.
arXiv Detail & Related papers (2023-05-23T17:16:04Z)
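The Active Learning entry above describes ranking candidate demonstrations by combining low model uncertainty with similarity to the test example. A minimal sketch of one such scoring rule (the function name, the entropy-based uncertainty term, and the mixing weight `alpha` are all assumptions, not the paper's actual criterion):

```python
import numpy as np

def al_rank(test_emb, cand_embs, cand_label_probs, alpha=0.5):
    """Rank candidates by (high similarity to the test example, low
    predictive entropy), a common AL-style scoring heuristic."""
    # Similarity term: cosine similarity to the test example.
    sims = cand_embs @ test_emb / (
        np.linalg.norm(cand_embs, axis=1) * np.linalg.norm(test_emb) + 1e-12
    )
    # Uncertainty term: predictive entropy of each candidate (lower is better).
    ent = -np.sum(cand_label_probs * np.log(cand_label_probs + 1e-12), axis=1)
    score = alpha * sims - (1 - alpha) * ent
    return np.argsort(-score)  # best-first candidate indices
```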
This list is automatically generated from the titles and abstracts of the papers in this site.