Misconfidence-based Demonstration Selection for LLM In-Context Learning
- URL: http://arxiv.org/abs/2401.06301v1
- Date: Fri, 12 Jan 2024 00:11:24 GMT
- Title: Misconfidence-based Demonstration Selection for LLM In-Context Learning
- Authors: Shangqing Xu, Chao Zhang (Georgia Institute of Technology)
- Abstract summary: In-context learning with large language models (LLMs) excels at adapting to various tasks rapidly.
Current approaches to this problem either rely on hard-to-acquire external supervision or require frequent interactions with LLMs.
We propose a new method called In-Context Reflection (ICR) to overcome these challenges.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In-context learning with large language models (LLMs) excels at adapting to
various tasks rapidly. However, its success hinges on carefully selecting
demonstrations, which remains an obstacle in practice. Current approaches to
this problem either rely on hard-to-acquire external supervision or require
frequent interactions with LLMs, resulting in high costs. We propose a new
method called In-Context Reflection (ICR) to overcome these challenges. ICR
strategically selects demonstrations to reduce the discrepancy between the
LLM's outputs and the actual input-output mappings. Specifically, ICR starts
with a random set of initial demonstrations, then iteratively refines it. In
each step, it analyzes a pool of candidate examples and identifies the ones
most likely to challenge the LLM's current understanding, measured by a new
metric called misconfidence. These most confusing examples are then selected to
replace the less informative demonstrations in the current set. Our
comprehensive evaluation across five diverse datasets encompassing 13 subtasks
shows the efficacy of ICR. Compared to existing methods, ICR achieves an
average performance boost of 4%, while demonstrating remarkable cross-task
generalization capabilities.
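The abstract describes ICR as an iterative loop: start from random demonstrations, score candidate examples by misconfidence, and swap the most confusing candidates in for the least informative demonstrations. A minimal Python sketch of that loop, reconstructed from the abstract alone, is below; the exact misconfidence formula (here, the model's highest wrong-label probability divided by its gold-label probability), the score_fn helper, and the rule for which demonstrations are dropped are illustrative assumptions, not details confirmed by the paper.

```python
import random

def misconfidence(label_probs: dict, gold: str) -> float:
    # Assumed definition: how strongly the model prefers some wrong label
    # over the gold label (higher = more confusing under the current demos).
    top_wrong = max((p for y, p in label_probs.items() if y != gold), default=0.0)
    return top_wrong / max(label_probs.get(gold, 1e-9), 1e-9)

def icr_select(candidates, score_fn, k=8, rounds=3):
    """candidates: list of (input, gold_label) pairs.
    score_fn(demos, x): assumed helper that prompts the LLM with the current
    demonstrations `demos` and returns a dict of label -> probability for x."""
    demos = random.sample(candidates, k)  # random initial demonstration set
    for _ in range(rounds):
        pool = [c for c in candidates if c not in demos]
        # Rank remaining candidates by how much they contradict the LLM's
        # current predictions, i.e., by misconfidence.
        ranked = sorted(
            pool,
            key=lambda ex: misconfidence(score_fn(demos, ex[0]), ex[1]),
            reverse=True,
        )
        hardest = ranked[: k // 2]
        # Replace half of the current set with the most confusing candidates;
        # which demonstrations count as "less informative" is simplified here.
        demos = demos[: k - len(hardest)] + hardest
    return demos
```

Each refinement round costs one scoring call per pool example with the current demonstrations in the prompt, which is where the abstract's concern about interaction cost would show up in practice.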
Related papers
- Large Language Models Know What Makes Exemplary Contexts [42.90814615222177]
In-context learning (ICL) has proven to be a significant capability with the advancement of Large Language Models (LLMs).
This paper presents a unified framework for LLMs that allows them to self-select influential in-context examples to compose their contexts.
arXiv Detail & Related papers (2024-08-14T12:32:41Z)
- Debiasing Multimodal Large Language Models [61.6896704217147]
Large Vision-Language Models (LVLMs) have become indispensable tools in computer vision and natural language processing.
Our investigation reveals a noteworthy bias in the generated content, where the output is primarily influenced by the prior of the underlying Large Language Model (LLM) rather than by the input image.
To rectify these biases and redirect the model's focus toward vision information, we introduce two simple, training-free strategies.
arXiv Detail & Related papers (2024-03-08T12:35:07Z)
- C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
arXiv Detail & Related papers (2024-02-17T11:28:08Z)
- Online Cascade Learning for Efficient Inference over Streams [9.516197133796437]
Large Language Models (LLMs) have a natural role in answering complex queries about data streams.
We propose online cascade learning, the first approach to address this challenge.
We formulate the task of learning cascades online as an imitation-learning problem.
arXiv Detail & Related papers (2024-02-07T01:46:50Z)
- TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models [52.734140807634624]
Aligned large language models (LLMs) demonstrate exceptional capabilities in task-solving, following instructions, and ensuring safety.
Existing continual learning benchmarks lack sufficient challenge for leading aligned LLMs.
We introduce TRACE, a novel benchmark designed to evaluate continual learning in LLMs.
arXiv Detail & Related papers (2023-10-10T16:38:49Z)
- Towards LLM-based Fact Verification on News Claims with a Hierarchical Step-by-Step Prompting Method [9.099277246096861]
In this paper, we examine large pre-trained language models (LLMs) with in-context learning (ICL) for news claim verification.
We introduce a Hierarchical Step-by-Step (HiSS) prompting method which directs LLMs to separate a claim into several subclaims and then verify each of them via multiple question-answering steps progressively; a rough sketch of this flow appears after this list.
Experiment results on two public misinformation datasets show that HiSS prompting outperforms a state-of-the-art fully-supervised approach and strong few-shot ICL-enabled baselines.
arXiv Detail & Related papers (2023-09-30T08:33:04Z)
- Active Learning Principles for In-Context Learning with Large Language Models [65.09970281795769]
This paper investigates how Active Learning algorithms can serve as effective demonstration selection methods for in-context learning.
We show that in-context example selection through AL prioritizes high-quality examples that exhibit low uncertainty and bear similarity to the test examples.
arXiv Detail & Related papers (2023-05-23T17:16:04Z)
- Iterative Forward Tuning Boosts In-Context Learning in Language Models [88.25013390669845]
In this study, we introduce a novel two-stage framework to boost in-context learning in large language models (LLMs).
Specifically, our framework delineates the ICL process into two distinct stages: Deep-Thinking and test stages.
The Deep-Thinking stage incorporates a unique attention mechanism, i.e., iterative enhanced attention, which enables multiple rounds of information accumulation.
arXiv Detail & Related papers (2023-05-22T13:18:17Z)
- ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction [56.790794611002106]
Large language models (LLMs) have demonstrated remarkable results in various natural language processing (NLP) tasks with in-context learning.
We propose a simple but effective in-context learning framework called ICL-D3IE.
Specifically, we extract the most difficult and distinct segments from hard training documents as hard demonstrations.
arXiv Detail & Related papers (2023-03-09T06:24:50Z)
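For the HiSS entry above, the decompose-then-verify flow it describes can be pictured with a short sketch. Everything below is a hedged illustration: the prompt wording, the ask_llm helper, and the all-subclaims-supported aggregation rule are assumptions, not details taken from that paper.

```python
def verify_claim(claim: str, ask_llm) -> str:
    # ask_llm(prompt) is an assumed helper returning the model's text reply.
    # Step 1: decompose the claim into subclaims.
    subclaims = ask_llm(
        f"Break the claim into independent subclaims, one per line:\n{claim}"
    ).splitlines()
    verdicts = []
    for sub in (s.strip() for s in subclaims):
        if not sub:
            continue
        # Step 2: verify each subclaim via progressive question answering.
        question = ask_llm(f"What question must be answered to check: {sub}")
        answer = ask_llm(f"Answer concisely: {question}")
        verdicts.append(ask_llm(
            f"Given the answer '{answer}', is the subclaim '{sub}' supported? "
            "Reply SUPPORTED or REFUTED."
        ))
    # Assumed aggregation rule: the claim holds only if every subclaim does.
    ok = verdicts and all("SUPPORTED" in v.upper() for v in verdicts)
    return "SUPPORTED" if ok else "REFUTED"
```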