On Theoretical Interpretations of Concept-Based In-Context Learning
- URL: http://arxiv.org/abs/2509.20882v2
- Date: Thu, 16 Oct 2025 05:50:52 GMT
- Title: On Theoretical Interpretations of Concept-Based In-Context Learning
- Authors: Huaze Tang, Tianren Peng, Shao-lun Huang
- Abstract summary: In-Context Learning (ICL) has emerged as an important new paradigm in natural language processing and large language model (LLM) applications. This paper investigates the theoretical underpinnings of ICL by studying a particular approach, called concept-based ICL (CB-ICL). In particular, we propose theoretical analyses of applying CB-ICL to ICL tasks, which explain why and when CB-ICL performs well at predicting query labels in prompts with only a few demonstrations.
- Score: 16.80110500426294
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In-Context Learning (ICL) has emerged as an important new paradigm in natural language processing and large language model (LLM) applications. However, the theoretical understanding of the ICL mechanism remains limited. This paper aims to investigate this issue by studying a particular ICL approach, called concept-based ICL (CB-ICL). In particular, we propose theoretical analyses of applying CB-ICL to ICL tasks, which explain why and when CB-ICL performs well at predicting query labels in prompts with only a few demonstrations. In addition, the proposed theory quantifies the knowledge that LLMs can leverage for the prompt tasks, and leads to a similarity measure between the prompt demonstrations and the query input, which provides important insights and guidance for model pre-training and prompt engineering in ICL. Moreover, the impacts of the prompt demonstration size and the dimension of the LLM embeddings on ICL are also explored based on the proposed theory. Finally, several real-data experiments are conducted to validate the practical usefulness of CB-ICL and the corresponding theory.
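The abstract's notion of a similarity measure between prompt demonstrations and the query input can be illustrated with a small sketch. The snippet below is not the paper's actual CB-ICL measure; it assumes a hypothetical `embed` function standing in for an LLM embedding layer and simply ranks candidate demonstrations by cosine similarity to the query, which is one common way such a measure is used in prompt engineering.

```python
import numpy as np

def embed(text: str, dim: int = 8) -> np.ndarray:
    # Hypothetical stand-in for an LLM embedding function; in practice this
    # would call an encoder or the LLM's own embedding layer. The hash-based
    # seed only makes the toy output deterministic within a single run.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.normal(size=dim)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_demonstrations(demos: list[str], query: str) -> list[tuple[str, float]]:
    """Rank candidate demonstrations by embedding similarity to the query."""
    q = embed(query)
    scored = [(d, cosine_similarity(embed(d), q)) for d in demos]
    return sorted(scored, key=lambda x: x[1], reverse=True)

if __name__ == "__main__":
    demos = [
        "Review: 'great acting' -> positive",
        "Review: 'dull plot' -> negative",
        "Review: 'stunning visuals' -> positive",
    ]
    query = "Review: 'brilliant soundtrack' ->"
    for demo, score in rank_demonstrations(demos, query):
        print(f"{score:+.3f}  {demo}")
```

In a real setting, the highest-scoring demonstrations would be placed in the prompt ahead of the query; the paper's theory concerns when such similarity-based selection can be expected to help, not this particular scoring function.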
Related papers
- Forget BIT, It is All about TOKEN: Towards Semantic Information Theory for LLMs [7.26032677670473]
Large language models (LLMs) have demonstrated remarkable capabilities in numerous real-world applications. How to open the black box of LLMs from a theoretical standpoint has become a critical challenge. This paper takes the theory of the rate-distortion function, directed information, and Granger causality as its starting point.
arXiv Detail & Related papers (2025-11-03T03:56:34Z) - Schema for In-Context Learning [0.7850388075652649]
In-context learning (ICL) enables language models to adapt to new tasks by conditioning on demonstration examples. Inspired by cognitive science, we introduce SCHEMA ACTIVATED IN CONTEXT (SA-ICL), a framework that extracts representations of the building blocks of cognition for the reasoning process instilled from prior examples. We show that SA-ICL consistently boosts performance, by up to 36.19 percent, when the single demonstration example is of high quality.
arXiv Detail & Related papers (2025-10-14T21:00:15Z) - Illusion or Algorithm? Investigating Memorization, Emergence, and Symbolic Processing in In-Context Learning [50.53703102032562]
Large-scale Transformer language models (LMs) trained solely on next-token prediction with web-scale data can solve a wide range of tasks. The mechanism behind this capability, known as in-context learning (ICL), remains both controversial and poorly understood.
arXiv Detail & Related papers (2025-05-16T08:50:42Z) - Investigating the Zone of Proximal Development of Language Models for In-Context Learning [59.91708683601029]
We introduce a learning analytics framework to analyze the in-context learning (ICL) behavior of large language models (LLMs). We adapt the Zone of Proximal Development (ZPD) theory to ICL, measuring the ZPD of LLMs based on model performance on individual examples. Our findings reveal a series of intricate and multifaceted behaviors of ICL, providing new insights into understanding and leveraging this technique.
arXiv Detail & Related papers (2025-02-10T19:36:21Z) - Multimodal Contrastive In-Context Learning [0.9120312014267044]
This paper introduces a novel multimodal contrastive in-context learning framework to enhance our understanding of gradient-free in-context learning (ICL) in Large Language Models (LLMs).
First, we present a contrastive learning-based interpretation of ICL in real-world settings, treating the distance between key-value representations as the differentiator in ICL.
Second, we develop an analytical framework to address biases in multimodal input formatting for real-world datasets.
Third, we propose an on-the-fly approach for ICL that demonstrates effectiveness in detecting hateful memes.
arXiv Detail & Related papers (2024-08-23T10:10:01Z) - ICLEval: Evaluating In-Context Learning Ability of Large Language Models [68.7494310749199]
In-Context Learning (ICL) is a critical capability of Large Language Models (LLMs) as it empowers them to comprehend and reason across interconnected inputs. Existing evaluation frameworks primarily focus on language abilities and knowledge, often overlooking the assessment of ICL ability. We introduce the ICLEval benchmark to evaluate the ICL abilities of LLMs, which encompasses two key sub-abilities: exact copying and rule learning.
arXiv Detail & Related papers (2024-06-21T08:06:10Z) - TEGEE: Task dEfinition Guided Expert Ensembling for Generalizable and Few-shot Learning [37.09785060896196]
We propose TEGEE (Task dEfinition Guided Expert Ensembling), a method that explicitly extracts task definitions. Our framework employs a dual 3B model approach, with each model assigned a distinct role. Empirical evaluations show that TEGEE performs comparably to the larger LLaMA2-13B model.
arXiv Detail & Related papers (2024-03-07T05:26:41Z) - Let's Learn Step by Step: Enhancing In-Context Learning Ability with Curriculum Learning [9.660673938961416]
Demonstration ordering is an important strategy for in-context learning (ICL).
We propose a simple but effective demonstration-ordering method for ICL, named few-shot In-Context Curriculum Learning (ICCL).
arXiv Detail & Related papers (2024-02-16T14:55:33Z) - Data Poisoning for In-context Learning [49.77204165250528]
In-context learning (ICL) has been recognized for its innovative ability to adapt to new tasks. This paper delves into the critical issue of ICL's susceptibility to data poisoning attacks. We introduce ICLPoison, a specialized attacking framework conceived to exploit the learning mechanisms of ICL.
arXiv Detail & Related papers (2024-02-03T14:20:20Z) - A Principled Framework for Knowledge-enhanced Large Language Model [58.1536118111993]
Large Language Models (LLMs) are versatile, yet they often falter in tasks requiring deep and reliable reasoning.
This paper introduces a rigorously designed framework for creating LLMs that effectively anchor knowledge and employ a closed-loop reasoning process.
arXiv Detail & Related papers (2023-11-18T18:10:02Z) - In-Context Exemplars as Clues to Retrieving from Large Associative Memory [1.2952137350423816]
In-context learning (ICL) enables large language models (LLMs) to learn patterns from in-context exemplars without training.
How to choose exemplars remains unclear due to the lack of understanding of how in-context learning works.
Our study sheds new light on the mechanism of ICL by connecting it to memory retrieval.
arXiv Detail & Related papers (2023-11-06T20:13:29Z) - A Survey on In-context Learning [77.78614055956365]
In-context learning (ICL) has emerged as a new paradigm for natural language processing (NLP).
We first present a formal definition of ICL and clarify its correlation to related studies.
We then organize and discuss advanced techniques, including training strategies, prompt designing strategies, and related analysis.
arXiv Detail & Related papers (2022-12-31T15:57:09Z)
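As a closing illustration of the prompt-design and demonstration-ordering ideas that recur in the papers above (e.g., the ICCL and survey entries), here is a minimal, hypothetical sketch of how a few-shot ICL prompt can be assembled, with an optional easy-to-hard ordering of demonstrations. It is not the exact procedure of any paper listed here; the `Demonstration` fields and difficulty scores are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Demonstration:
    text: str          # input text of the demonstration
    label: str         # gold label shown to the model
    difficulty: float  # hypothetical difficulty score (human- or model-rated)

def build_icl_prompt(demos: list[Demonstration], query: str,
                     easy_to_hard: bool = True) -> str:
    """Assemble a few-shot ICL prompt from demonstrations and a query.

    If easy_to_hard is True, demonstrations are ordered from low to high
    difficulty, loosely mirroring curriculum-style ordering; this sketch is
    not the ICCL paper's exact method.
    """
    ordered = sorted(demos, key=lambda d: d.difficulty, reverse=not easy_to_hard)
    lines = [f"Input: {d.text}\nLabel: {d.label}" for d in ordered]
    lines.append(f"Input: {query}\nLabel:")
    return "\n\n".join(lines)

if __name__ == "__main__":
    demos = [
        Demonstration("the movie was fine", "neutral", difficulty=0.7),
        Demonstration("I loved it", "positive", difficulty=0.2),
        Demonstration("not terrible, not great", "neutral", difficulty=0.9),
    ]
    print(build_icl_prompt(demos, query="an instant classic"))
```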