Investigating the Learning Behaviour of In-context Learning: A
Comparison with Supervised Learning
- URL: http://arxiv.org/abs/2307.15411v2
- Date: Tue, 1 Aug 2023 16:04:09 GMT
- Title: Investigating the Learning Behaviour of In-context Learning: A
Comparison with Supervised Learning
- Authors: Xindi Wang, Yufei Wang, Can Xu, Xiubo Geng, Bowen Zhang, Chongyang
Tao, Frank Rudzicz, Robert E. Mercer and Daxin Jiang
- Abstract summary: Large language models (LLMs) have shown remarkable capacity for in-context learning (ICL).
We train the same LLMs with the same demonstration examples via ICL and supervised learning (SL), respectively, and investigate their performance under label perturbations.
First, we find that gold labels have significant impacts on the downstream in-context performance, especially for large language models.
Second, when comparing with SL, we show empirically that ICL is less sensitive to label perturbations than SL, and ICL gradually attains comparable performance to SL as the model size increases.
- Score: 67.25698169440818
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have shown remarkable capacity for in-context
learning (ICL), where learning a new task from just a few training examples is
done without being explicitly pre-trained. However, despite the success of
LLMs, there has been little understanding of how ICL learns the knowledge from
the given prompts. In this paper, to make progress toward understanding the
learning behaviour of ICL, we train the same LLMs with the same demonstration
examples via ICL and supervised learning (SL), respectively, and investigate
their performance under label perturbations (i.e., noisy labels and label
imbalance) on a range of classification tasks. First, via extensive
experiments, we find that gold labels have significant impacts on the
downstream in-context performance, especially for large language models;
however, imbalanced labels matter little to ICL across all model sizes. Second,
when comparing with SL, we show empirically that ICL is less sensitive to label
perturbations than SL, and ICL gradually attains comparable performance to SL
as the model size increases.
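The noisy-label perturbation described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `add_label_noise` and `build_icl_prompt`, the `Input:`/`Label:` prompt template, and the toy sentiment examples are all assumptions made for illustration. The sketch replaces a chosen fraction of gold labels in a demonstration set with a different label and then formats the demonstrations as an in-context prompt.

```python
import random


def add_label_noise(demos, labels, noise_ratio, seed=0):
    """Return a copy of `demos` in which round(noise_ratio * len(demos))
    examples have their gold label swapped for a different label.

    `demos` is a list of (text, label) pairs; `labels` is the label set.
    Hypothetical helper, not taken from the paper.
    """
    rng = random.Random(seed)
    n_flip = round(noise_ratio * len(demos))
    flip_idx = set(rng.sample(range(len(demos)), n_flip))
    noisy = []
    for i, (text, label) in enumerate(demos):
        if i in flip_idx:
            # Pick any label other than the gold one.
            label = rng.choice([l for l in labels if l != label])
        noisy.append((text, label))
    return noisy


def build_icl_prompt(demos, query):
    """Format demonstrations plus a query as one few-shot prompt.

    The template is an assumption; the paper's template may differ.
    """
    blocks = [f"Input: {text}\nLabel: {label}" for text, label in demos]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    labels = ["positive", "negative"]
    demos = [
        ("good movie", "positive"),
        ("bad movie", "negative"),
        ("great acting", "positive"),
        ("awful plot", "negative"),
    ]
    noisy = add_label_noise(demos, labels, noise_ratio=0.5, seed=0)
    print(build_icl_prompt(noisy, "a fine film"))
```

Under this setup, the same perturbed `(text, label)` pairs could serve either as in-context demonstrations or as a fine-tuning set for the supervised baseline, which is the kind of like-for-like comparison the abstract describes.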
Related papers
- Focused Large Language Models are Stable Many-Shot Learners [18.783939647966776]
In-Context Learning (ICL) enables large language models (LLMs) to achieve rapid task adaptation by learning from demonstrations.
We propose a training-free method FocusICL, which conducts triviality filtering to avoid attention being diverted by unimportant contents.
We show that FocusICL achieves an average performance improvement of 5.2% over vanilla ICL and scales well with many-shot demonstrations.
arXiv Detail & Related papers (2024-08-26T02:53:24Z)
- Memorization in In-Context Learning [42.218016081867376]
In-context learning (ICL) has proven to be an effective strategy for improving the performance of large language models (LLMs) with no additional training.
This study is the first to show how ICL surfaces memorized training data and to explore the correlation between this memorization and performance.
arXiv Detail & Related papers (2024-08-21T11:54:22Z)
- ICLEval: Evaluating In-Context Learning Ability of Large Language Models [68.7494310749199]
In-Context Learning (ICL) is a critical capability of Large Language Models (LLMs) as it empowers them to comprehend and reason across interconnected inputs.
Existing evaluation frameworks primarily focus on language abilities and knowledge, often overlooking the assessment of ICL ability.
We introduce the ICLEval benchmark to evaluate the ICL abilities of LLMs, which encompasses two key sub-abilities: exact copying and rule learning.
arXiv Detail & Related papers (2024-06-21T08:06:10Z)
- Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning [99.05401042153214]
In-context learning (ICL) is potentially attributed to two major abilities: task recognition (TR) and task learning (TL).
We take the first step by examining the pre-training dynamics of the emergence of ICL.
We propose a simple yet effective method to better integrate these two abilities for ICL at inference time.
arXiv Detail & Related papers (2024-06-20T06:37:47Z)
- Does In-Context Learning Really Learn? Rethinking How Large Language Models Respond and Solve Tasks via In-Context Learning [41.606494950216764]
In-context Learning (ICL) has emerged as a powerful capability alongside the development of scaled-up large language models (LLMs).
This paper decomposes the overall performance of ICL into three dimensions: label space, format, and discrimination.
We show that ICL exhibits significant efficacy in regulating the label space and format, which helps LLMs respond to desired label words.
arXiv Detail & Related papers (2024-04-11T08:20:10Z)
- The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition [74.04775677110179]
In-context Learning (ICL) has emerged as a powerful paradigm for performing natural language tasks with Large Language Models (LLMs).
We show that LLMs have strong yet inconsistent priors in emotion recognition that ossify their predictions.
Our results suggest that caution is needed when using ICL with larger LLMs for affect-centered tasks outside their pre-training domain.
arXiv Detail & Related papers (2024-03-25T19:07:32Z)
- In-Context Learning Learns Label Relationships but Is Not Conventional
Learning [60.891931501449726]
There is currently no consensus about how the in-context learning (ICL) ability of Large Language Models works.
We provide novel insights into how ICL leverages label information, revealing both capabilities and limitations.
Our experiments show that ICL predictions almost always depend on in-context labels and that ICL can learn truly novel tasks in-context.
arXiv Detail & Related papers (2023-07-23T16:54:41Z)
- What In-Context Learning "Learns" In-Context: Disentangling Task
Recognition and Task Learning [24.395288160951118]
Large language models (LLMs) exploit in-context learning (ICL) to solve tasks with only a few demonstrations.
We characterize two ways through which ICL leverages demonstrations.
We show that models can achieve non-trivial performance with only TR, and TR does not further improve with larger models or more demonstrations.
arXiv Detail & Related papers (2023-05-16T18:05:19Z)
- Self-supervised Learning is More Robust to Dataset Imbalance [65.84339596595383]
We investigate self-supervised learning under dataset imbalance.
Off-the-shelf self-supervised representations are already more robust to class imbalance than supervised representations.
We devise a re-weighted regularization technique that consistently improves the SSL representation quality on imbalanced datasets.
arXiv Detail & Related papers (2021-10-11T06:29:56Z)