The Alchemy of Thought: Understanding In-Context Learning Through Supervised Classification
- URL: http://arxiv.org/abs/2601.01290v1
- Date: Sat, 03 Jan 2026 21:33:12 GMT
- Title: The Alchemy of Thought: Understanding In-Context Learning Through Supervised Classification
- Authors: Harshita Narnoli, Mihai Surdeanu
- Abstract summary: In this paper, we compare the behavior of in-context learning with supervised classifiers trained on ICL demonstrations. We observe that LLMs behave similarly to these classifiers when the relevance of demonstrations is high.
- Score: 19.524454103388553
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In-context learning (ICL) has become a prominent paradigm to rapidly customize LLMs to new tasks without fine-tuning. However, despite the empirical evidence of its usefulness, we still do not truly understand how ICL works. In this paper, we compare the behavior of in-context learning with supervised classifiers trained on ICL demonstrations to investigate three research questions: (1) Do LLMs with ICL behave similarly to classifiers trained on the same examples? (2) If so, which classifiers are closer, those based on gradient descent (GD) or those based on k-nearest neighbors (kNN)? (3) When they do not behave similarly, what conditions are associated with differences in behavior? Using text classification as a use case, with six datasets and three LLMs, we observe that LLMs behave similarly to these classifiers when the relevance of demonstrations is high. On average, ICL is closer to kNN than to logistic regression, giving empirical evidence that the attention mechanism behaves more similarly to kNN than to GD. However, when demonstration relevance is low, LLMs perform better than these classifiers, likely because LLMs can back off to their parametric memory, a luxury these classifiers do not have.
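The comparison the abstract describes can be made concrete with a small sketch. This is not the paper's code: the bag-of-words features, the toy demonstrations, and the agreement metric are illustrative assumptions. The idea is to train a kNN classifier and a GD-trained logistic regression on the same demonstrations, then measure how often each agrees with a set of reference predictions (standing in for the LLM's ICL outputs).

```python
import math

def featurize(text, vocab):
    # Bag-of-words counts over a fixed vocabulary (an illustrative choice).
    toks = text.lower().split()
    return [toks.count(w) for w in vocab]

def knn_predict(x, X, y, k=3):
    # Majority vote among the k nearest demonstrations (Euclidean distance).
    def dist(a, b):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
    ranked = sorted(range(len(X)), key=lambda i: dist(x, X[i]))[:k]
    votes = [y[i] for i in ranked]
    return max(set(votes), key=votes.count)

def train_logreg(X, y, lr=0.5, epochs=200):
    # Plain gradient descent on the log-loss: the "GD-based" classifier.
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - yi  # gradient of the log-loss w.r.t. the logit
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def logreg_predict(x, w, b):
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if z >= 0 else 0

def agreement(preds_a, preds_b):
    # Fraction of inputs on which two predictors give the same label.
    return sum(a == b for a, b in zip(preds_a, preds_b)) / len(preds_a)

# Toy demonstrations (hypothetical sentiment data, label 1 = positive).
vocab = ["good", "great", "bad", "awful"]
demos = [("good great good", 1), ("great good", 1),
         ("bad awful", 0), ("awful bad bad", 0)]
X = [featurize(t, vocab) for t, _ in demos]
y = [lab for _, lab in demos]
w, b = train_logreg(X, y)
```

In the paper's setup, the reference predictions would come from an LLM prompted with the same demonstrations; here `agreement(llm_preds, knn_preds)` versus `agreement(llm_preds, logreg_preds)` is the quantity of interest.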
Related papers
- Verifying Large Language Models' Reasoning Paths via Correlation Matrix Rank [71.09032766271493]
Large language models (LLMs) are prone to errors and hallucinations. How to check their outputs effectively and efficiently has become a critical problem in their applications.
arXiv Detail & Related papers (2025-10-28T11:01:10Z) - Supervised Fine-Tuning or Contrastive Learning? Towards Better Multimodal LLM Reranking [56.46309219272326]
For large language models (LLMs), classification via supervised fine-tuning (SFT) predicts the "yes" (resp. "no") token for relevant (resp. irrelevant) pairs. This divergence raises a central question: which objective is intrinsically better suited to LLM-based reranking, and what mechanism underlies the difference? We conduct a comprehensive comparison and analysis between CL and SFT for reranking, taking universal multimodal retrieval (UMR) as the experimental playground.
arXiv Detail & Related papers (2025-10-16T16:02:27Z) - In a Few Words: Comparing Weak Supervision and LLMs for Short Query Intent Classification [4.037445459586932]
We empirically compare user intent classification into informational, navigational, and transactional categories. Our results indicate that while LLMs outperform weak supervision in recall, they continue to struggle with precision.
arXiv Detail & Related papers (2025-04-30T07:54:04Z) - Computation Mechanism Behind LLM Position Generalization [59.013857707250814]
Large language models (LLMs) exhibit flexibility in handling textual positions. They can understand texts with position perturbations and generalize to longer texts. This work connects the linguistic phenomenon with LLMs' computational mechanisms.
arXiv Detail & Related papers (2025-03-17T15:47:37Z) - Hint-enhanced In-Context Learning wakes Large Language Models up for knowledge-intensive tasks [54.153914606302486]
In-context learning (ICL) ability has emerged with the increasing scale of large language models (LLMs).
We propose a new paradigm called Hint-enhanced In-Context Learning (HICL) to explore the power of ICL in open-domain question answering.
arXiv Detail & Related papers (2023-11-03T14:39:20Z) - Do pretrained Transformers Learn In-Context by Gradient Descent? [21.23795112800977]
In this paper, we investigate the emergence of In-Context Learning (ICL) in language models pre-trained on natural data (LLaMa-7B).
We find that ICL and Gradient Descent (GD) modify the output distribution of language models differently.
These results indicate that the equivalence between ICL and GD remains an open hypothesis and call for further studies.
arXiv Detail & Related papers (2023-10-12T17:32:09Z) - Investigating the Learning Behaviour of In-context Learning: A Comparison with Supervised Learning [67.25698169440818]
Large language models (LLMs) have shown remarkable capacity for in-context learning (ICL).
We train the same LLMs with the same demonstration examples via ICL and supervised learning (SL), respectively, and investigate their performance under label perturbations.
First, we find that gold labels have significant impacts on the downstream in-context performance, especially for large language models.
Second, when comparing with SL, we show empirically that ICL is less sensitive to label perturbations than SL, and ICL gradually attains comparable performance to SL as the model size increases.
arXiv Detail & Related papers (2023-07-28T09:03:19Z) - Understanding Emergent In-Context Learning from a Kernel Regression Perspective [55.95455089638838]
Large language models (LLMs) have initiated a paradigm shift in transfer learning. This paper proposes a kernel-regression perspective for understanding LLMs' ICL behaviors when faced with in-context examples. We find that during ICL, the attention and hidden features in LLMs match the behaviors of a kernel regression.
arXiv Detail & Related papers (2023-05-22T06:45:02Z) - What In-Context Learning "Learns" In-Context: Disentangling Task Recognition and Task Learning [24.395288160951118]
Large language models (LLMs) exploit in-context learning (ICL) to solve tasks with only a few demonstrations.
We characterize two ways through which ICL leverages demonstrations.
We show that models can achieve non-trivial performance with only TR, and TR does not further improve with larger models or more demonstrations.
arXiv Detail & Related papers (2023-05-16T18:05:19Z)