Related papers: Distilling Many-Shot In-Context Learning into a Cheat Sheet

Distilling Many-Shot In-Context Learning into a Cheat Sheet

URL: http://arxiv.org/abs/2509.20820v1
Date: Thu, 25 Sep 2025 07:07:46 GMT
Title: Distilling Many-Shot In-Context Learning into a Cheat Sheet
Authors: Ukyo Honda, Soichiro Murakami, Peinan Zhang,
Abstract summary: We propose cheat-sheet ICL, which distills the information from many-shot ICL into a concise textual summary (cheat sheet) used as the context at inference time.<n> Experiments on challenging reasoning tasks show that cheat-sheet ICL achieves comparable or better performance than many-shot ICL with far fewer tokens, and matches retrieval-based ICL without requiring test-time retrieval.
Score: 14.147877327632607
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent advances in large language models (LLMs) enable effective in-context learning (ICL) with many-shot examples, but at the cost of high computational demand due to longer input tokens. To address this, we propose cheat-sheet ICL, which distills the information from many-shot ICL into a concise textual summary (cheat sheet) used as the context at inference time. Experiments on challenging reasoning tasks show that cheat-sheet ICL achieves comparable or better performance than many-shot ICL with far fewer tokens, and matches retrieval-based ICL without requiring test-time retrieval. These findings demonstrate that cheat-sheet ICL is a practical alternative for leveraging LLMs in downstream tasks.

Related papers

Who Gets Cited Most? Benchmarking Long-Context Language Models on Scientific Articles [81.89404347890662]
SciTrek is a novel question-answering benchmark designed to evaluate the long-context reasoning capabilities of large language models (LLMs) using scientific articles.<n>Our analysis reveals systematic shortcomings in models' abilities to perform basic numerical operations and accurately locate specific information in long contexts.
arXiv Detail & Related papers (2025-09-25T11:36:09Z)
MAPLE: Many-Shot Adaptive Pseudo-Labeling for In-Context Learning [53.02571749383208]
In-Context Learning (ICL) empowers Large Language Models (LLMs) to tackle diverse tasks by incorporating multiple input-output examples.<n>Many-Shot Adaptive Pseudo-LabEling (MAPLE) is a novel influence-based many-shot ICL framework that utilizes pseudo-labeled samples to compensate for the lack of label information.
arXiv Detail & Related papers (2025-05-22T04:54:27Z)
In-context Continual Learning Assisted by an External Continual Learner [19.382196203113836]
Existing continual learning (CL) methods rely on fine-tuning or adapting large language models (LLMs)<n>We introduce InCA, a novel approach that integrates an external continual learner (ECL) with ICL to enable scalable CL without CF.
arXiv Detail & Related papers (2024-12-20T04:44:41Z)
On Many-Shot In-Context Learning for Long-Context Evaluation [10.500629810624769]
This paper delves into long-context language model evaluation through many-shot ICL.<n>We develop metrics to categorize ICL tasks into two groups: similar-sample learning (SSL) and all-sample learning (ASL)<n>We find that while state-of-the-art models demonstrate good performance up to 64k tokens in SSL tasks, many models experience significant performance drops at only 16k tokens in ASL tasks.
arXiv Detail & Related papers (2024-11-11T17:00:59Z)
Implicit In-context Learning [37.0562059811099]
We introduce Implicit In-context Learning (I2CL), an innovative paradigm that reduces the inference cost of ICL to that of zero-shot learning with minimal information loss.<n>I2CL achieves few-shot level performance at zero-shot inference cost, and it exhibits robustness against variations in demonstration examples.
arXiv Detail & Related papers (2024-05-23T14:57:52Z)
Many-Shot In-Context Learning [58.395589302800566]
Large language models (LLMs) excel at few-shot in-context learning (ICL) We observe significant performance gains across a wide variety of generative and discriminative tasks. Unlike few-shot learning, many-shot learning is effective at overriding pretraining biases.
arXiv Detail & Related papers (2024-04-17T02:49:26Z)
ParaICL: Towards Parallel In-Context Learning [74.38022919598443]
Large language models (LLMs) have become the norm in natural language processing.<n>Few-shot in-context learning (ICL) relies on the choice of few-shot demonstration examples.<n>We propose a novel method named parallel in-context learning (ParaICL)
arXiv Detail & Related papers (2024-03-31T05:56:15Z)
Batch-ICL: Effective, Efficient, and Order-Agnostic In-Context Learning [27.729189318779603]
Batch-ICL is an effective, efficient, and order-agnostic inference algorithm for in-context learning. We show that Batch-ICL consistently outperforms most permutations of ICL examples. We also develop a novel variant of Batch-ICL featuring multiple "epochs" of meta-optimization.
arXiv Detail & Related papers (2024-01-12T09:31:17Z)
Hint-enhanced In-Context Learning wakes Large Language Models up for knowledge-intensive tasks [54.153914606302486]
In-context learning (ICL) ability has emerged with the increasing scale of large language models (LLMs) We propose a new paradigm called Hint-enhanced In-Context Learning (HICL) to explore the power of ICL in open-domain question answering.
arXiv Detail & Related papers (2023-11-03T14:39:20Z)
In-Context Learning Learns Label Relationships but Is Not Conventional Learning [60.891931501449726]
There is currently no consensus about how in-context learning (ICL) ability of Large Language Models works. We provide novel insights into how ICL leverages label information, revealing both capabilities and limitations. Our experiments show that ICL predictions almost always depend on in-context labels and that ICL can learn truly novel tasks in-context.
arXiv Detail & Related papers (2023-07-23T16:54:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.