Self-Generated In-Context Learning: Leveraging Auto-regressive Language
Models as a Demonstration Generator
- URL: http://arxiv.org/abs/2206.08082v1
- Date: Thu, 16 Jun 2022 10:52:13 GMT
- Title: Self-Generated In-Context Learning: Leveraging Auto-regressive Language
Models as a Demonstration Generator
- Authors: Hyuhng Joon Kim, Hyunsoo Cho, Junyeob Kim, Taeuk Kim, Kang Min Yoo,
Sang-goo Lee
- Abstract summary: Self-generated in-context learning (SG-ICL) generates demonstrations for in-context learning from PLM itself.
We show SG-ICL significantly outperforms zero-shot learning and is generally worth approximately 0.6 gold training samples.
- Score: 22.532627423361177
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large-scale pre-trained language models (PLMs) are well-known for being
capable of solving a task simply by conditioning a few input-label pairs dubbed
demonstrations on a prompt without being explicitly tuned for the desired
downstream task. Such a process (i.e., in-context learning), however, naturally
leads to high reliance on the demonstrations which are usually selected from
external datasets. In this paper, we propose self-generated in-context learning
(SG-ICL), which generates demonstrations for in-context learning from PLM
itself to minimize the reliance on the external demonstration. We conduct
experiments on four different text classification tasks and show SG-ICL
significantly outperforms zero-shot learning and is generally worth
approximately 0.6 gold training samples. Moreover, our generated demonstrations
show more consistent performance with low variance compared to randomly
selected demonstrations from the training dataset.
Related papers
- Show, Don't Tell: Aligning Language Models with Demonstrated Feedback [54.10302745921713]
Demonstration ITerated Task Optimization (DITTO) directly aligns language model outputs to a user's demonstrated behaviors.
We evaluate DITTO's ability to learn fine-grained style and task alignment across domains such as news articles, emails, and blog posts.
arXiv Detail & Related papers (2024-06-02T23:13:56Z) - Revisiting Demonstration Selection Strategies in In-Context Learning [66.11652803887284]
Large language models (LLMs) have shown an impressive ability to perform a wide range of tasks using in-context learning (ICL)
In this work, we first revisit the factors contributing to this variance from both data and model aspects, and find that the choice of demonstration is both data- and model-dependent.
We propose a data- and model-dependent demonstration selection method, textbfTopK + ConE, based on the assumption that textitthe performance of a demonstration positively correlates with its contribution to the model's understanding of the test samples.
arXiv Detail & Related papers (2024-01-22T16:25:27Z) - In-context Learning with Retrieved Demonstrations for Language Models: A Survey [23.24271704145876]
Few-shot in-context learners (ICL) are adept at adapting to new tasks with just a few demonstrations in the input context.
Instead of using a fixed set of demonstrations, one recent development is to retrieve demonstrations tailored to each input query.
We discuss and compare different design choices for retrieval models, retrieval training procedures, and inference algorithms.
arXiv Detail & Related papers (2024-01-21T23:34:42Z) - Dynamic Demonstrations Controller for In-Context Learning [51.3439660534631]
In-Context Learning (ICL) is a new paradigm for natural language processing (NLP), where a large language model observes a small number of demonstrations and a test instance as its input.
Previous studies have revealed that ICL is sensitive to the selection and the ordering of demonstrations.
We propose a Dynamic Demonstrations Controller (D$2$Controller), which can improve the ICL performance by adjusting the number of demonstrations.
arXiv Detail & Related papers (2023-09-30T14:04:22Z) - Self-ICL: Zero-Shot In-Context Learning with Self-Generated
Demonstrations [38.4166247280112]
Self-ICL is a framework which bootstraps LMs' intrinsic capabilities to perform zero-shot ICL.
Self-ICL outperforms zero-shot baselines on both average accuracy and head-to-head comparison.
arXiv Detail & Related papers (2023-05-24T11:22:34Z) - In-Context Demonstration Selection with Cross Entropy Difference [95.21947716378641]
Large language models (LLMs) can use in-context demonstrations to improve performance on zero-shot tasks.
We present a cross-entropy difference (CED) method for selecting in-context demonstrations.
arXiv Detail & Related papers (2023-05-24T05:04:00Z) - Active Learning Principles for In-Context Learning with Large Language
Models [65.09970281795769]
This paper investigates how Active Learning algorithms can serve as effective demonstration selection methods for in-context learning.
We show that in-context example selection through AL prioritizes high-quality examples that exhibit low uncertainty and bear similarity to the test examples.
arXiv Detail & Related papers (2023-05-23T17:16:04Z) - Dr.ICL: Demonstration-Retrieved In-context Learning [29.142262267850704]
In-context learning (ICL) teaching a large language model to perform a task with few-shot demonstrations has emerged as a strong paradigm for using LLMs.
Recent research suggests that retrieving semantically similar demonstrations to the input from a pool of available demonstrations results in better performance.
This work expands the applicability of retrieval-based ICL approaches by demonstrating that even simple word-overlap similarity measures such as BM25 outperform randomly selected demonstrations.
arXiv Detail & Related papers (2023-05-23T14:55:25Z) - Robustness of Demonstration-based Learning Under Limited Data Scenario [54.912936555876826]
Demonstration-based learning has shown great potential in stimulating pretrained language models' ability under limited data scenario.
Why such demonstrations are beneficial for the learning process remains unclear since there is no explicit alignment between the demonstrations and the predictions.
In this paper, we design pathological demonstrations by gradually removing intuitively useful information from the standard ones to take a deep dive of the robustness of demonstration-based sequence labeling.
arXiv Detail & Related papers (2022-10-19T16:15:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.