Which Examples to Annotate for In-Context Learning? Towards Effective and Efficient Selection
- URL: http://arxiv.org/abs/2310.20046v1
- Date: Mon, 30 Oct 2023 22:03:55 GMT
- Title: Which Examples to Annotate for In-Context Learning? Towards Effective and Efficient Selection
- Authors: Costas Mavromatis, Balasubramaniam Srinivasan, Zhengyuan Shen, Jiani Zhang, Huzefa Rangwala, Christos Faloutsos, George Karypis
- Abstract summary: Large Language Models (LLMs) can adapt to new tasks via in-context learning (ICL).
In this work, we investigate an active learning approach for ICL, where there is a limited budget for annotating examples.
We propose a model-adaptive optimization-free algorithm, termed AdaICL, which identifies examples that the model is uncertain about.
- Score: 35.924633625147365
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) can adapt to new tasks via in-context learning (ICL). ICL is efficient because it requires no parameter updates to the trained LLM, only a few annotated examples as input. In this work, we investigate an active learning approach for ICL, where there is a limited budget for annotating examples. We propose a model-adaptive, optimization-free algorithm, termed AdaICL, which identifies examples the model is uncertain about and performs semantic diversity-based example selection. Diversity-based sampling improves overall effectiveness, while uncertainty sampling improves budget efficiency and helps the LLM learn new information. Moreover, AdaICL poses its sampling strategy as a Maximum Coverage problem, which dynamically adapts based on the model's feedback and can be approximately solved via greedy algorithms. Extensive experiments on nine datasets and seven LLMs show that AdaICL improves performance by 4.4 accuracy points over SOTA (a 7.7% relative improvement), is up to 3x more budget-efficient than annotating examples uniformly at random, and outperforms SOTA with 2x fewer ICL examples.
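The abstract states the strategy (uncertainty sampling, semantic diversity, a greedy Maximum Coverage formulation) without pseudocode. Below is a minimal sketch of one selection round under stated assumptions: `embeddings` and `uncertainty` are precomputed inputs (the paper derives uncertainty from the model's own feedback), each candidate is assumed to "cover" the hard examples among its k nearest neighbors, and the loop is the standard greedy approximation for Maximum Coverage. This is an illustrative reading of the abstract, not the authors' reference implementation.

```python
import numpy as np

def adaicl_select(embeddings, uncertainty, budget, threshold=0.5, k=10):
    """Sketch of uncertainty + diversity selection via greedy Maximum Coverage.

    embeddings:  (n, d) semantic embeddings of the unlabeled pool
    uncertainty: (n,) model uncertainty per example, in [0, 1]
    budget:      number of examples to pick for annotation
    threshold:   examples above this uncertainty are "hard" and need covering
    k:           each candidate covers its k nearest neighbors (assumption)
    """
    n = embeddings.shape[0]
    # Cosine similarity between all pairs of examples.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T
    # A candidate covers the hard examples among its k nearest neighbors.
    hard = set(np.flatnonzero(uncertainty >= threshold))
    neighbors = np.argsort(-sims, axis=1)[:, :k]
    cover = [set(nb) & hard for nb in neighbors]

    selected, covered = [], set()
    for _ in range(budget):
        # Greedy Maximum Coverage step: take the candidate covering the most
        # not-yet-covered hard examples; break ties by higher uncertainty.
        gain, _, best = max(
            (len(cover[i] - covered), uncertainty[i], i)
            for i in range(n) if i not in selected
        )
        if gain == 0:
            # Everything coverable is covered; fall back to the most
            # uncertain remaining candidate.
            best = max((i for i in range(n) if i not in selected),
                       key=lambda i: uncertainty[i])
        selected.append(best)
        covered |= cover[best]
    return selected
```

In the paper's formulation the process adapts dynamically: after each annotation round the model's uncertainty is re-estimated and the coverage problem is rebuilt. The single static pass above is kept deliberately short.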
Related papers
- Divide, Reweight, and Conquer: A Logit Arithmetic Approach for In-Context Learning [19.16587730306472]
In-Context Learning (ICL) emerges as a key feature of Large Language Models (LLMs).
We propose the Logit Arithmetic Reweighting Approach (LARA), a novel framework that enhances ICL through logit-based ensembling of multiple demonstrations.
arXiv Detail & Related papers (2024-10-14T01:34:16Z)
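The LARA entry above only names the idea, so the following is a minimal sketch of generic logit ensembling over demonstration groups, assuming a Hugging Face causal LM. The uniform `weights` default is a placeholder; LARA's actual contribution is how the per-group logits are reweighted, which is not reproduced here.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def ensemble_logits(model, tokenizer, demo_groups, query, weights=None):
    """Combine next-token logits from prompts built on demonstration subsets.

    demo_groups: list of strings, each a subset of the demonstrations
    query:       test input appended to every prompt
    weights:     per-group weights; uniform here, reweighted in LARA
    """
    if weights is None:
        weights = [1.0 / len(demo_groups)] * len(demo_groups)
    combined = None
    for w, demos in zip(weights, demo_groups):
        inputs = tokenizer(demos + "\n" + query, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits[0, -1]  # logits for the next token
        combined = w * logits if combined is None else combined + w * logits
    return combined

# Usage sketch: decode the answer token with the highest ensembled logit.
# model = AutoModelForCausalLM.from_pretrained("gpt2")
# tokenizer = AutoTokenizer.from_pretrained("gpt2")
# logits = ensemble_logits(model, tokenizer,
#                          ["Review: great! Label: positive",
#                           "Review: awful. Label: negative"],
#                          "Review: loved it. Label:")
# print(tokenizer.decode([int(logits.argmax())]))
```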
- EVOLvE: Evaluating and Optimizing LLMs For Exploration [76.66831821738927]
Large language models (LLMs) remain under-studied in scenarios requiring optimal decision-making under uncertainty.
We measure LLMs' (in)ability to make optimal decisions in bandits, a stateless reinforcement learning setting relevant to many applications.
Motivated by the existence of optimal exploration algorithms, we propose efficient ways to integrate this algorithmic knowledge into LLMs.
arXiv Detail & Related papers (2024-10-08T17:54:03Z)
- In-Context Learning with Reinforcement Learning for Incomplete Utterance Rewriting [33.89176174108559]
In-context learning of large language models (LLMs) makes predictions based only on instructions augmented with a few examples.
Existing example selection methods for ICL use sparse or dense retrievers and achieve effective performance.
We propose a policy-based reinforcement learning framework for example selection (RLS), which consists of a language model (LM) selector and an LLM generator.
arXiv Detail & Related papers (2024-08-23T12:32:12Z)
- SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models [8.558834738072363]
Large language models (LLMs) have gained increased popularity due to their remarkable success across various tasks.
However, individual LLMs have limitations when applied to complex tasks due to factors such as training biases, model sizes, and the datasets used.
We introduce SelectLLM, a novel algorithm that directs input queries to the most suitable subset of LLMs from a large pool.
arXiv Detail & Related papers (2024-08-16T06:11:21Z)
- Large Language Models Know What Makes Exemplary Contexts [42.90814615222177]
In-context learning (ICL) has proven to be a significant capability with the advancement of Large Language Models (LLMs).
This paper presents a unified framework for LLMs that allows them to self-select influential in-context examples to compose their contexts.
arXiv Detail & Related papers (2024-08-14T12:32:41Z)
- Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve model alignment across different task scenarios.
We implement UAL in a simple fashion: adaptively setting the label-smoothing value during training according to the uncertainty of individual samples.
Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
arXiv Detail & Related papers (2024-06-07T11:37:45Z)
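The UAL summary above is concrete enough for a small sketch: a cross-entropy loss whose label-smoothing value is set per sample from an uncertainty estimate. The linear mapping from uncertainty to smoothing (`max_smooth`) is an assumption for illustration, not the paper's formula.

```python
import torch
import torch.nn.functional as F

def uncertainty_aware_loss(logits, targets, uncertainty, max_smooth=0.2):
    """Cross-entropy with per-sample label smoothing scaled by uncertainty.

    logits:      (batch, num_classes) model outputs
    targets:     (batch,) gold class indices
    uncertainty: (batch,) in [0, 1]; higher means a less reliable sample
    """
    num_classes = logits.size(-1)
    log_probs = F.log_softmax(logits, dim=-1)
    # Confident samples keep sharp targets; uncertain samples are smoothed
    # more heavily (illustrative linear schedule, an assumption here).
    eps = (uncertainty * max_smooth).unsqueeze(-1)            # (batch, 1)
    one_hot = F.one_hot(targets, num_classes).float()
    soft_targets = one_hot * (1 - eps) + eps / num_classes
    return -(soft_targets * log_probs).sum(dim=-1).mean()
```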
- One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models [67.49462724595445]
Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs).
We propose a novel method that involves learning scalable and pluggable virtual tokens for RAG.
arXiv Detail & Related papers (2024-05-30T03:44:54Z)
- Prompt Optimization with EASE? Efficient Ordering-aware Automated Selection of Exemplars [66.823588073584]
Large language models (LLMs) have shown impressive capabilities in real-world applications.
The quality of the exemplars in the prompt greatly impacts ICL performance.
Existing methods fail to adequately account for the impact of exemplar ordering on performance.
arXiv Detail & Related papers (2024-05-25T08:23:05Z)
- ParaICL: Towards Robust Parallel In-Context Learning [74.38022919598443]
Large language models (LLMs) have become the norm in natural language processing.
Few-shot in-context learning (ICL) relies on the choice of demonstration examples.
We propose a novel method named parallel in-context learning (ParaICL).
arXiv Detail & Related papers (2024-03-31T05:56:15Z)
- From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning [52.257422715393574]
We introduce a self-guided methodology for Large Language Models (LLMs) to autonomously discern and select cherry samples from open-source datasets.
Our key innovation, the Instruction-Following Difficulty (IFD) metric, identifies discrepancies between a model's expected responses and its intrinsic generation capability.
arXiv Detail & Related papers (2023-08-23T09:45:29Z)
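A common reading of the IFD metric above is the ratio between the answer's language-modeling loss when conditioned on the instruction and its loss with no instruction at all; values near 1 mean the instruction barely helps. The sketch below follows that reading with a Hugging Face causal LM; tokenization details and normalization in the paper may differ.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def ifd_score(model, tokenizer, instruction, answer):
    """Instruction-Following Difficulty (sketch):
    loss(answer | instruction) / loss(answer alone)."""
    def answer_loss(prefix):
        # Assumes prefix and answer tokenize consistently when concatenated.
        ids = tokenizer(prefix + answer, return_tensors="pt").input_ids
        n_prefix = tokenizer(prefix, return_tensors="pt").input_ids.size(1)
        labels = ids.clone()
        labels[:, :n_prefix] = -100  # score only the answer tokens
        with torch.no_grad():
            return model(ids, labels=labels).loss.item()

    return answer_loss(instruction) / answer_loss("")

# Usage sketch (hypothetical data):
# model = AutoModelForCausalLM.from_pretrained("gpt2")
# tokenizer = AutoTokenizer.from_pretrained("gpt2")
# print(ifd_score(model, tokenizer, "Translate to French: hello\n", "bonjour"))
```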
- Data Curation Alone Can Stabilize In-context Learning [20.874674130060388]
In-context learning (ICL) enables large language models to perform new tasks by prompting them with a sequence of training examples.
However, randomly sampling examples from a training set leads to high variance in performance.
We show that carefully curating a subset of training data greatly stabilizes ICL performance without any other changes to the ICL algorithm.
arXiv Detail & Related papers (2022-12-20T15:58:54Z)