Let's Learn Step by Step: Enhancing In-Context Learning Ability with Curriculum Learning
- URL: http://arxiv.org/abs/2402.10738v2
- Date: Sun, 16 Jun 2024 13:26:10 GMT
- Title: Let's Learn Step by Step: Enhancing In-Context Learning Ability with Curriculum Learning
- Authors: Yinpeng Liu, Jiawei Liu, Xiang Shi, Qikai Cheng, Yong Huang, Wei Lu,
- Abstract summary: Demonstration ordering is an important strategy for in-context learning (ICL).
We propose a simple but effective demonstration ordering method for ICL, named few-shot In-Context Curriculum Learning (ICCL).
- Score: 9.660673938961416
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Demonstration ordering, an important strategy for in-context learning (ICL), can significantly affect the performance of large language models (LLMs). However, most current ordering approaches incur high computational costs to introduce prior knowledge. In this paper, inspired by the human learning process, we propose a simple but effective demonstration ordering method for ICL, named few-shot In-Context Curriculum Learning (ICCL). ICCL gradually increases the complexity of prompt demonstrations during inference. Difficulty can be assessed by human experts or by LLM-driven metrics such as perplexity. We then design extensive experiments to evaluate the effectiveness of ICCL at both the corpus level and the instance level. Moreover, we investigate the formation mechanism of the ICCL capability in LLMs. Experimental results demonstrate that ICCL, developed during the instruction-tuning stage, is effective for representative open-source LLMs. To facilitate further research and applications by other scholars, we make the code publicly available.
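The core idea in the abstract can be sketched in a few lines: sort the few-shot demonstrations from easiest to hardest before assembling the prompt. The sketch below is illustrative only and is not the authors' released code; `difficulty_fn` is a placeholder for whatever difficulty signal is used (the paper mentions human expert judgments or LLM-driven metrics such as perplexity), and here a toy token-count proxy stands in so the example is self-contained.

```python
# Minimal sketch of In-Context Curriculum Learning (ICCL) ordering.
# difficulty_fn is a stand-in for a real difficulty measure
# (e.g., perplexity under an LLM, or human expert ratings).

def iccl_order(demonstrations, difficulty_fn):
    """Return demonstrations sorted from easiest to hardest."""
    return sorted(demonstrations, key=difficulty_fn)

def build_prompt(demonstrations, query, difficulty_fn):
    """Assemble a few-shot prompt with curriculum-ordered demonstrations."""
    ordered = iccl_order(demonstrations, difficulty_fn)
    shots = "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in ordered)
    return f"{shots}\n\nInput: {query}\nOutput:"

demos = [
    ("A long, syntactically involved example sentence", "label-c"),
    ("Short example", "label-a"),
    ("A medium-length example sentence", "label-b"),
]

# Toy difficulty: token count as a crude proxy for perplexity.
prompt = build_prompt(demos, "New query", lambda d: len(d[0].split()))
print(prompt.splitlines()[0])  # the easiest demonstration comes first
```

In a real setting, `difficulty_fn` would score each demonstration once (e.g., by computing its perplexity under the target LLM) before sorting, so the ordering cost is a single pass over the demonstration pool.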
Related papers
- Investigating the Zone of Proximal Development of Language Models for In-Context Learning [59.91708683601029]
We introduce a learning analytics framework to analyze the in-context learning (ICL) behavior of large language models (LLMs).
We adapt the Zone of Proximal Development (ZPD) theory to ICL, measuring the ZPD of LLMs based on model performance on individual examples.
Our findings reveal a series of intricate and multifaceted behaviors of ICL, providing new insights into understanding and leveraging this technique.
arXiv Detail & Related papers (2025-02-10T19:36:21Z) - Multimodal Contrastive In-Context Learning [0.9120312014267044]
This paper introduces a novel multimodal contrastive in-context learning framework to enhance our understanding of gradient-free in-context learning (ICL) in Large Language Models (LLMs).
First, we present a contrastive learning-based interpretation of ICL in real-world settings, marking the distance of the key-value representation as the differentiator in ICL.
Second, we develop an analytical framework to address biases in multimodal input formatting for real-world datasets.
Third, we propose an on-the-fly approach for ICL that demonstrates effectiveness in detecting hateful memes.
arXiv Detail & Related papers (2024-08-23T10:10:01Z) - ICLEval: Evaluating In-Context Learning Ability of Large Language Models [68.7494310749199]
In-Context Learning (ICL) is a critical capability of Large Language Models (LLMs) as it empowers them to comprehend and reason across interconnected inputs.
Existing evaluation frameworks primarily focus on language abilities and knowledge, often overlooking the assessment of ICL ability.
We introduce the ICLEval benchmark to evaluate the ICL abilities of LLMs, which encompasses two key sub-abilities: exact copying and rule learning.
arXiv Detail & Related papers (2024-06-21T08:06:10Z) - Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning [99.05401042153214]
In-context learning (ICL) is potentially attributed to two major abilities: task recognition (TR) and task learning (TL).
We take the first step by examining the pre-training dynamics of the emergence of ICL.
We propose a simple yet effective method to better integrate these two abilities for ICL at inference time.
arXiv Detail & Related papers (2024-06-20T06:37:47Z) - Data Poisoning for In-context Learning [49.77204165250528]
In-context learning (ICL) has been recognized for its innovative ability to adapt to new tasks.
This paper delves into the critical issue of ICL's susceptibility to data poisoning attacks.
We introduce ICLPoison, a specialized attacking framework conceived to exploit the learning mechanisms of ICL.
arXiv Detail & Related papers (2024-02-03T14:20:20Z) - Knowledgeable In-Context Tuning: Exploring and Exploiting Factual Knowledge for In-Context Learning [37.22349652230841]
Large language models (LLMs) enable in-context learning (ICL) by conditioning on a few labeled training examples as a text-based prompt.
In this paper, we demonstrate that factual knowledge is imperative for the performance of ICL in three core facets.
We introduce a novel Knowledgeable In-Context Tuning (KICT) framework to further improve the performance of ICL.
arXiv Detail & Related papers (2023-09-26T09:06:39Z) - Iterative Forward Tuning Boosts In-Context Learning in Language Models [88.25013390669845]
In this study, we introduce a novel two-stage framework to boost in-context learning in large language models (LLMs).
Specifically, our framework delineates the ICL process into two distinct stages: Deep-Thinking and test stages.
The Deep-Thinking stage incorporates a unique attention mechanism, i.e., iterative enhanced attention, which enables multiple rounds of information accumulation.
arXiv Detail & Related papers (2023-05-22T13:18:17Z) - A Survey on In-context Learning [77.78614055956365]
In-context learning (ICL) has emerged as a new paradigm for natural language processing (NLP).
We first present a formal definition of ICL and clarify its correlation to related studies.
We then organize and discuss advanced techniques, including training strategies, prompt designing strategies, and related analysis.
arXiv Detail & Related papers (2022-12-31T15:57:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.