Related papers: A Survey on In-context Learning

URL: http://arxiv.org/abs/2301.00234v4
Date: Tue, 18 Jun 2024 04:19:31 GMT
Title: A Survey on In-context Learning
Authors: Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng, Jingyuan Ma, Rui Li, Heming Xia, Jingjing Xu, Zhiyong Wu, Baobao Chang, Xu Sun, Lei Li, Zhifang Sui
Abstract summary: In-context learning (ICL) has emerged as a new paradigm for natural language processing (NLP). We first present a formal definition of ICL and clarify its correlation to related studies. We then organize and discuss advanced techniques, including training strategies, prompt designing strategies, and related analysis.
Score: 75.41718234460895
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: With the increasing capabilities of large language models (LLMs), in-context learning (ICL) has emerged as a new paradigm for natural language processing (NLP), where LLMs make predictions based on contexts augmented with a few examples. It has been a significant trend to explore ICL to evaluate and extrapolate the ability of LLMs. In this paper, we aim to survey and summarize the progress and challenges of ICL. We first present a formal definition of ICL and clarify its correlation to related studies. Then, we organize and discuss advanced techniques, including training strategies, prompt designing strategies, and related analysis. Additionally, we explore various ICL application scenarios, such as data engineering and knowledge updating. Finally, we address the challenges of ICL and suggest potential directions for further research. We hope that our work can encourage more research on uncovering how ICL works and improving ICL.

Related papers

Multimodal Contrastive In-Context Learning [0.9120312014267044]
This paper introduces a novel multimodal contrastive in-context learning framework to enhance our understanding of gradient-free in-context learning (ICL) in Large Language Models (LLMs). First, we present a contrastive learning-based interpretation of ICL in real-world settings, marking the distance of the key-value representation as the differentiator in ICL. Second, we develop an analytical framework to address biases in multimodal input formatting for real-world datasets. Third, we propose an on-the-fly approach for ICL that demonstrates effectiveness in detecting hateful memes.
arXiv Detail & Related papers (2024-08-23T10:10:01Z)
ICLEval: Evaluating In-Context Learning Ability of Large Language Models [68.7494310749199]
In-Context Learning (ICL) is a critical capability of Large Language Models (LLMs) as it empowers them to comprehend and reason across interconnected inputs. Existing evaluation frameworks primarily focus on language abilities and knowledge, often overlooking the assessment of ICL ability. We introduce the ICLEval benchmark to evaluate the ICL abilities of LLMs, which encompasses two key sub-abilities: exact copying and rule learning.
arXiv Detail & Related papers (2024-06-21T08:06:10Z)
Let's Learn Step by Step: Enhancing In-Context Learning Ability with Curriculum Learning [9.660673938961416]
Demonstration ordering is an important strategy for in-context learning (ICL). We propose a simple but effective demonstration ordering method for ICL, named the few-shot In-Context Curriculum Learning (ICCL).
arXiv Detail & Related papers (2024-02-16T14:55:33Z)
Data Poisoning for In-context Learning [49.77204165250528]
In-context learning (ICL) has been recognized for its innovative ability to adapt to new tasks. This paper delves into the critical issue of ICL's susceptibility to data poisoning attacks. We introduce ICLPoison, a specialized attacking framework conceived to exploit the learning mechanisms of ICL.
arXiv Detail & Related papers (2024-02-03T14:20:20Z)
In-Context Exemplars as Clues to Retrieving from Large Associative Memory [1.2952137350423816]
In-context learning (ICL) enables large language models (LLMs) to learn patterns from in-context exemplars without training. How to choose exemplars remains unclear due to the lack of understanding of how in-context learning works. Our study sheds new light on the mechanism of ICL by connecting it to memory retrieval.
arXiv Detail & Related papers (2023-11-06T20:13:29Z)
Knowledgeable In-Context Tuning: Exploring and Exploiting Factual Knowledge for In-Context Learning [37.22349652230841]
Large language models (LLMs) enable in-context learning (ICL) by conditioning on a few labeled training examples as a text-based prompt. In this paper, we demonstrate that factual knowledge is imperative for the performance of ICL in three core facets. We introduce a novel Knowledgeable In-Context Tuning (KICT) framework to further improve the performance of ICL.
arXiv Detail & Related papers (2023-09-26T09:06:39Z)
Instruction Tuning for Large Language Models: A Survey [52.86322823501338]
We make a systematic review of the literature on instruction tuning (IT), including the general methodology of IT, the construction of IT datasets, the training of IT models, and applications to different modalities, domains and applications. We also review the potential pitfalls of IT and criticism against it, along with efforts pointing out current deficiencies of existing strategies, and suggest some avenues for fruitful research.
arXiv Detail & Related papers (2023-08-21T15:35:16Z)
In-Context Learning Learns Label Relationships but Is Not Conventional Learning [60.891931501449726]
There is currently no consensus about how the in-context learning (ICL) ability of Large Language Models works. We provide novel insights into how ICL leverages label information, revealing both capabilities and limitations. Our experiments show that ICL predictions almost always depend on in-context labels and that ICL can learn truly novel tasks in-context.
arXiv Detail & Related papers (2023-07-23T16:54:41Z)
Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning [77.7070536959126]
In-context learning (ICL) emerges as a promising capability of large language models (LLMs). In this paper, we investigate the working mechanism of ICL through an information flow lens. We introduce an anchor re-weighting method to improve ICL performance, a demonstration compression technique to expedite inference, and an analysis framework for diagnosing ICL errors in GPT2-XL.
arXiv Detail & Related papers (2023-05-23T15:26:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.