Self-ICL: Zero-Shot In-Context Learning with Self-Generated
Demonstrations
- URL: http://arxiv.org/abs/2305.15035v2
- Date: Mon, 23 Oct 2023 14:50:57 GMT
- Title: Self-ICL: Zero-Shot In-Context Learning with Self-Generated
Demonstrations
- Authors: Wei-Lin Chen, Cheng-Kuang Wu, Yun-Nung Chen, Hsin-Hsi Chen
- Abstract summary: Self-ICL is a framework which bootstraps LMs' intrinsic capabilities to perform zero-shot ICL.
Self-ICL outperforms zero-shot baselines on both average accuracy and head-to-head comparison.
- Score: 38.4166247280112
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have exhibited striking in-context learning
(ICL) ability to adapt to target tasks with a few input-output demonstrations.
For better ICL, different methods are proposed to select representative
demonstrations from existing training corpora. However, such settings are not
aligned with real-world practices, as end-users usually query LMs without
access to demonstration pools. In this work, we introduce Self-ICL -- a simple
framework which bootstraps LMs' intrinsic capabilities to perform zero-shot
ICL. Given a test input, Self-ICL first prompts the model to generate
pseudo-inputs. Next, the model predicts pseudo-labels for the pseudo-inputs via
zero-shot prompting. Finally, we perform ICL for the test input with the
pseudo-input-label pairs as demonstrations. Evaluation on 23 BIG-Bench Hard
tasks shows Self-ICL outperforms zero-shot baselines on both average accuracy
and head-to-head comparison. Moreover, with zero-shot chain-of-thought,
Self-ICL achieves results comparable to using real demonstrations.
Additionally, we conduct a range of analyses to validate Self-ICL's
effectiveness and provide insights for its behaviors under different settings.
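The three steps described in the abstract (generate pseudo-inputs, label them zero-shot, then run ICL with the resulting pairs) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `LM` callable, the function names, and all prompt wording are hypothetical, and `toy_lm` is a deterministic stand-in used only to make the sketch runnable.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: any function mapping a prompt string to a completion.
LM = Callable[[str], str]


def generate_pseudo_inputs(lm: LM, task: str, test_input: str, k: int = 3) -> List[str]:
    """Step 1: ask the model for k new inputs modeled on the test input."""
    prompt = (
        f"Task: {task}\nExample input: {test_input}\n"
        f"Write {k} new, diverse inputs for this task, one per line."
    )
    lines = [ln.strip() for ln in lm(prompt).splitlines() if ln.strip()]
    return lines[:k]


def predict_pseudo_labels(lm: LM, task: str, pseudo_inputs: List[str]) -> List[Tuple[str, str]]:
    """Step 2: label each pseudo-input with a zero-shot prompt."""
    return [(x, lm(f"Task: {task}\nQ: {x}\nA:").strip()) for x in pseudo_inputs]


def self_icl(lm: LM, task: str, test_input: str, k: int = 3) -> str:
    """Step 3: answer the test input using the pseudo pairs as demonstrations."""
    demos = predict_pseudo_labels(lm, task, generate_pseudo_inputs(lm, task, test_input, k))
    shots = "\n\n".join(f"Q: {x}\nA: {y}" for x, y in demos)
    return lm(f"Task: {task}\n\n{shots}\n\nQ: {test_input}\nA:").strip()


def toy_lm(prompt: str) -> str:
    """Deterministic stand-in for a real model (assumption, for illustration only)."""
    if "one per line" in prompt:
        return "2 + 2\n3 + 5\n"
    # Answer the last "Q:" in the prompt by adding its two integers.
    last_q = prompt.rsplit("Q: ", 1)[1].split("\n")[0]
    a, b = last_q.split(" + ")
    return str(int(a) + int(b))


answer = self_icl(toy_lm, "Add the two numbers.", "1 + 6", k=2)
```

In practice `toy_lm` would be replaced by a call to an actual language model. The same model is used for all three steps, so no external demonstration pool is needed.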
Related papers
- Unifying Demonstration Selection and Compression for In-Context Learning [14.545490629324295]
We propose an ICL framework UniICL, which Unifies demonstration selection and compression, and final response generation via a single frozen LLM.
UniICL is a parameter-efficient framework that only contains 17M trainable parameters originating from the projection layer.
arXiv Detail & Related papers (2024-05-27T11:31:58Z)
- ParaICL: Towards Robust Parallel In-Context Learning [74.38022919598443]
Large language models (LLMs) have become the norm in natural language processing.
Few-shot in-context learning (ICL) relies on the choice of few-shot demonstration examples.
We propose a novel method named parallel in-context learning (ParaICL).
arXiv Detail & Related papers (2024-03-31T05:56:15Z)
- Improving Input-label Mapping with Demonstration Replay for In-context
Learning [67.57288926736923]
In-context learning (ICL) is an emerging capability of large autoregressive language models.
We propose a novel ICL method called Repeated Demonstration with Sliding Causal Attention (RdSca).
We show that our method significantly improves the input-label mapping in ICL demonstrations.
arXiv Detail & Related papers (2023-10-30T14:29:41Z)
- Dynamic Demonstrations Controller for In-Context Learning [51.3439660534631]
In-Context Learning (ICL) is a new paradigm for natural language processing (NLP), where a large language model observes a small number of demonstrations and a test instance as its input.
Previous studies have revealed that ICL is sensitive to the selection and the ordering of demonstrations.
We propose a Dynamic Demonstrations Controller (D$^2$Controller), which can improve ICL performance by adjusting the number of demonstrations.
arXiv Detail & Related papers (2023-09-30T14:04:22Z)
- Explaining Emergent In-Context Learning as Kernel Regression [61.57151500616111]
Large language models (LLMs) have initiated a paradigm shift in transfer learning.
In this paper, we investigate the reason why a transformer-based language model can accomplish in-context learning after pre-training.
We find that during ICL, the attention and hidden features in LLMs match the behaviors of a kernel regression.
arXiv Detail & Related papers (2023-05-22T06:45:02Z)
- Z-ICL: Zero-Shot In-Context Learning with Pseudo-Demonstrations [97.41375480696972]
We introduce Z-ICL, a new zero-shot method that closes the gap by constructing pseudo-demonstrations for a given test input.
Evaluation on nine classification datasets shows that Z-ICL outperforms previous zero-shot methods by a significant margin.
arXiv Detail & Related papers (2022-12-19T21:34:26Z)
- Self-Generated In-Context Learning: Leveraging Auto-regressive Language
Models as a Demonstration Generator [22.532627423361177]
Self-generated in-context learning (SG-ICL) generates demonstrations for in-context learning from the PLM itself.
We show SG-ICL significantly outperforms zero-shot learning and is generally worth approximately 0.6 gold training samples.
arXiv Detail & Related papers (2022-06-16T10:52:13Z)