Comparable Demonstrations are Important in In-Context Learning: A Novel
Perspective on Demonstration Selection
- URL: http://arxiv.org/abs/2312.07476v2
- Date: Tue, 9 Jan 2024 10:08:53 GMT
- Title: Comparable Demonstrations are Important in In-Context Learning: A Novel
Perspective on Demonstration Selection
- Authors: Caoyun Fan, Jidong Tian, Yitian Li, Hao He, Yaohui Jin
- Abstract summary: In-Context Learning (ICL) is an important paradigm for adapting Large Language Models (LLMs) to downstream tasks through a few demonstrations.
This study explores the ICL mechanisms from a novel perspective, providing a deeper insight into the demonstration selection strategy for ICL.
- Score: 22.29452683679149
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In-Context Learning (ICL) is an important paradigm for adapting Large
Language Models (LLMs) to downstream tasks through a few demonstrations.
Despite the great success of ICL, the limited number of demonstrations
may lead to demonstration bias, i.e., the input-label mapping induced by the
LLM misunderstands the task's essence. Inspired by human experience, we attempt to
mitigate such bias through the perspective of the inter-demonstration
relationship. Specifically, we construct Comparable Demonstrations (CDs) by
minimally editing the texts to flip the corresponding labels, in order to
highlight the task's essence and eliminate potential spurious correlations
through the inter-demonstration comparison. Through a series of experiments on
CDs, we find that (1) demonstration bias does exist in LLMs, and CDs can
significantly reduce such bias; (2) CDs exhibit good performance in ICL,
especially in out-of-distribution scenarios. In summary, this study explores
the ICL mechanisms from a novel perspective, providing a deeper insight into
the demonstration selection strategy for ICL.
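To make the construction concrete, the following is a minimal sketch of how comparable demonstration pairs might be assembled into an ICL prompt. The sentiment task, the example texts and their minimal edits, and the `query_llm` call are illustrative placeholders rather than material from the paper.

```python
# Minimal sketch: pair each demonstration with a minimally edited
# counterpart whose label is flipped, so the model can contrast the two
# and isolate the task-relevant feature. All texts are illustrative.

comparable_demos = [
    # (original text, label), (minimally edited text, flipped label)
    (("The plot was engaging from start to finish.", "positive"),
     ("The plot was tedious from start to finish.", "negative")),
    (("The acting felt wooden and forced.", "negative"),
     ("The acting felt natural and effortless.", "positive")),
]

def build_cd_prompt(pairs, query):
    """Interleave each demonstration with its label-flipped counterpart."""
    lines = []
    for (text_a, label_a), (text_b, label_b) in pairs:
        lines.append(f"Review: {text_a}\nSentiment: {label_a}")
        lines.append(f"Review: {text_b}\nSentiment: {label_b}")
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

prompt = build_cd_prompt(comparable_demos, "A film that rewards patience.")
# prediction = query_llm(prompt)  # hypothetical LLM call, not from the paper
```

The point of the pairing is that the only difference within each pair is the task-relevant feature, so spurious correlates such as topic, length, or style are held constant across the flipped labels.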
Related papers
- PICLe: Pseudo-Annotations for In-Context Learning in Low-Resource Named Entity Detection [56.916656013563355]
In-context learning (ICL) enables Large Language Models to perform tasks using few demonstrations.
We propose PICLe, a framework for in-context learning with noisy, pseudo-annotated demonstrations.
We evaluate PICLe on five biomedical NED datasets and show that, with zero human annotation, PICLe outperforms ICL in low-resource settings.
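The core idea lends itself to a short sketch: let the LLM pseudo-label unlabeled text, then recycle those noisy pairs as demonstrations. The `llm` callable and `prompt_fn` below are illustrative assumptions, not PICLe's actual interface.

```python
def pseudo_annotate(unlabeled_texts, llm, prompt_fn):
    """Use the LLM itself to produce noisy pseudo-labels, then reuse the
    (text, pseudo_label) pairs as in-context demonstrations downstream.
    `llm` maps a prompt string to a completion string; `prompt_fn` wraps
    a text in a zero-shot labeling instruction. Both are assumptions."""
    return [(text, llm(prompt_fn(text)).strip()) for text in unlabeled_texts]
```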
arXiv Detail & Related papers (2024-12-16T16:09:35Z)
- What Makes In-context Learning Effective for Mathematical Reasoning: A Theoretical Analysis [81.15503859645149]
In this paper, we aim to theoretically analyze the impact of in-context demonstrations on large language models' reasoning performance.
We propose a straightforward, generalizable, and low-complexity demonstration selection method named LMS3.
arXiv Detail & Related papers (2024-12-11T11:38:11Z)
- DemoShapley: Valuation of Demonstrations for In-Context Learning [20.26604061802236]
Large language models (LLMs) leveraging in-context learning (ICL) have set new benchmarks in few-shot learning across various tasks without needing task-specific fine-tuning.
We introduce DemoShapley, which is inspired by the Data Shapley valuation theorem.
Our findings reveal that DemoShapley not only enhances model performance in terms of accuracy and fairness but also generalizes to queries from domains distinct from those of the in-context demonstrations.
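Setting DemoShapley's specifics aside, permutation-based Monte Carlo sampling is the standard way to approximate Data Shapley values, and a generic sketch of demonstration valuation along those lines might look as follows; the `utility` function (e.g., validation accuracy of ICL with a given demonstration subset) is an assumption of this sketch.

```python
import random

def shapley_values(demos, utility, n_perms=200, seed=0):
    """Permutation-based Monte Carlo estimate of each demonstration's
    Shapley value. `utility(subset)` is assumed to return a score such
    as validation accuracy of ICL using that subset of demonstrations."""
    rng = random.Random(seed)
    values = {i: 0.0 for i in range(len(demos))}
    for _ in range(n_perms):
        order = list(range(len(demos)))
        rng.shuffle(order)
        subset, prev = [], utility([])
        for i in order:
            subset.append(demos[i])
            cur = utility(subset)
            values[i] += cur - prev  # marginal contribution of demo i
            prev = cur
    return {i: v / n_perms for i, v in values.items()}
```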
arXiv Detail & Related papers (2024-10-10T01:35:03Z)
- Focused Large Language Models are Stable Many-Shot Learners [18.783939647966776]
In-Context Learning (ICL) enables large language models (LLMs) to achieve rapid task adaptation by learning from demonstrations.
We propose a training-free method, FocusICL, which conducts triviality filtering to prevent attention from being diverted by unimportant content.
We show that FocusICL achieves an average performance improvement of 5.2% over vanilla ICL and scales well with many-shot demonstrations.
arXiv Detail & Related papers (2024-08-26T02:53:24Z)
- ParaICL: Towards Robust Parallel In-Context Learning [74.38022919598443]
Large language models (LLMs) have become the norm in natural language processing.
Few-shot in-context learning (ICL) relies on the choice of few-shot demonstration examples.
We propose a novel method named parallel in-context learning (ParaICL).
arXiv Detail & Related papers (2024-03-31T05:56:15Z)
- Ambiguity-Aware In-Context Learning with Large Language Models [27.20414960164616]
In-context learning (ICL), i.e., showing LLMs task-specific demonstrations, has led to downstream gains with no task-specific fine-tuning required.
This study investigates how to select good demonstrations for ICL.
We find that it is beneficial to not only choose semantically similar ICL demonstrations but also to choose those that help resolve the inherent label ambiguity surrounding the test example.
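A minimal sketch of the semantic-similarity half of such a selector follows, assuming a hypothetical `embed()` sentence-embedding function; the ambiguity-aware part of the method would additionally restrict the candidate pool to the test example's most ambiguous labels before this ranking.

```python
import numpy as np

def select_demonstrations(test_text, pool, embed, k=4):
    """Rank (text, label) candidates in `pool` by cosine similarity of
    their embeddings to the test input and return the top k.
    `embed(text) -> np.ndarray` is an assumed sentence-embedding model."""
    q = embed(test_text)
    q = q / np.linalg.norm(q)
    scored = []
    for text, label in pool:
        v = embed(text)
        sim = float(q @ (v / np.linalg.norm(v)))  # cosine similarity
        scored.append((sim, text, label))
    scored.sort(key=lambda s: s[0], reverse=True)
    return [(text, label) for _, text, label in scored[:k]]
```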
arXiv Detail & Related papers (2023-09-14T17:48:34Z)
- Scaling In-Context Demonstrations with Structured Attention [75.41845145597875]
We propose a better architectural design for in-context learning.
Structured Attention for In-Context Learning (SAICL) replaces full attention with a structured attention mechanism.
We show that SAICL achieves comparable or better performance than full attention while obtaining up to 3.4x inference speed-up.
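As an illustration only: one plausible structured pattern lets each demonstration attend within itself while the query attends to all positions, which removes the quadratic cross-demonstration interaction. The block-mask sketch below encodes that pattern; the actual SAICL design may differ.

```python
import numpy as np

def block_demo_mask(demo_lens, query_len):
    """Boolean attention mask: True means 'may attend'. Each demonstration
    block attends only within itself; query tokens attend everywhere.
    One plausible structured-attention pattern, not SAICL's exact one."""
    total = sum(demo_lens) + query_len
    mask = np.zeros((total, total), dtype=bool)
    start = 0
    for n in demo_lens:
        mask[start:start + n, start:start + n] = True  # intra-demonstration
        start += n
    mask[start:, :] = True  # query attends to all demonstrations and itself
    return mask
```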
arXiv Detail & Related papers (2023-07-05T23:26:01Z)
- Iterative Forward Tuning Boosts In-Context Learning in Language Models [88.25013390669845]
In this study, we introduce a novel two-stage framework to boost in-context learning in large language models (LLMs).
Specifically, our framework delineates the ICL process into two distinct stages: Deep-Thinking and test stages.
The Deep-Thinking stage incorporates a unique attention mechanism, i.e., iterative enhanced attention, which enables multiple rounds of information accumulation.
arXiv Detail & Related papers (2023-05-22T13:18:17Z)
- ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction [56.790794611002106]
Large language models (LLMs) have demonstrated remarkable results in various natural language processing (NLP) tasks with in-context learning.
We propose a simple but effective in-context learning framework called ICL-D3IE.
Specifically, we extract the most difficult and distinct segments from hard training documents as hard demonstrations.
arXiv Detail & Related papers (2023-03-09T06:24:50Z)