Meta-in-context learning in large language models
- URL: http://arxiv.org/abs/2305.12907v1
- Date: Mon, 22 May 2023 10:40:36 GMT
- Title: Meta-in-context learning in large language models
- Authors: Julian Coda-Forno, Marcel Binz, Zeynep Akata, Matthew Botvinick, Jane X. Wang, Eric Schulz
- Abstract summary: In-context learning -- the ability to improve at a task after being provided with a number of demonstrations -- is seen as one of the main contributors to the success of large language models. We demonstrate that the in-context learning abilities of such models can be recursively improved via in-context learning itself, a phenomenon we coin meta-in-context learning.
We show that meta-in-context learning adaptively reshapes a large language model's priors over expected tasks.
- Score: 38.28912796214566
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models have shown tremendous performance in a variety of
tasks. In-context learning -- the ability to improve at a task after being
provided with a number of demonstrations -- is seen as one of the main
contributors to their success. In the present paper, we demonstrate that the
in-context learning abilities of large language models can be recursively
improved via in-context learning itself. We coin this phenomenon
meta-in-context learning. Looking at two idealized domains, a one-dimensional
regression task and a two-armed bandit task, we show that meta-in-context
learning adaptively reshapes a large language model's priors over expected
tasks. Furthermore, we find that meta-in-context learning modifies the
in-context learning strategies of such models. Finally, we extend our approach
to a benchmark of real-world regression problems, where we observe performance
competitive with traditional learning algorithms. Taken together, our work
improves our understanding of in-context learning and paves the way toward
adapting large language models to the environments in which they are applied,
purely through meta-in-context learning rather than through traditional
finetuning.
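To make the setup concrete, here is a minimal sketch of how a meta-in-context prompt for the one-dimensional regression task might be assembled: demonstrations from several earlier tasks are simply concatenated in front of the current task, so the model can adapt across tasks before learning within the final one. The linear task family, prompt wording, and helper names are illustrative assumptions, not the paper's exact prompts.

```python
import random

def make_regression_task(n_points=5):
    """Sample an illustrative linear task y = a*x + b (an assumed task
    family, not necessarily the paper's exact task distribution)."""
    a, b = random.uniform(-2, 2), random.uniform(-5, 5)
    xs = [round(random.uniform(0, 10), 1) for _ in range(n_points)]
    return [(x, round(a * x + b, 1)) for x in xs]

def format_task(pairs, header):
    """Render one task's (x, y) demonstrations as plain text."""
    return "\n".join([header] + [f"x = {x}, y = {y}" for x, y in pairs])

def meta_in_context_prompt(n_prev_tasks=3, query_x=4.2):
    """Concatenate several earlier tasks in front of the current one, so
    the model can adapt across tasks before predicting within the last."""
    parts = [format_task(make_regression_task(), f"Task {i + 1} observations:")
             for i in range(n_prev_tasks)]
    parts.append(format_task(make_regression_task(n_points=3),
                             f"Task {n_prev_tasks + 1} observations:"))
    parts.append(f"x = {query_x}, y =")  # the model completes this line
    return "\n\n".join(parts)

print(meta_in_context_prompt())
```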
Related papers
- Scalable Language Model with Generalized Continual Learning [58.700439919096155]
Joint Adaptive Re-Parameterization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks.
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z)
- Fine-Tune Language Models as Multi-Modal Differential Equation Solvers [14.181842691371935]
We present a transformation of in-context operator learning into a multi-modal paradigm.
In particular, we take inspiration from the recent success of large language models, and propose using "captions" to integrate human knowledge about the operator.
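As a rough illustration of how a caption could carry human knowledge into an in-context operator-learning prompt, the sketch below interleaves a caption with demonstration (condition, solution) pairs; the field names, layout, and example ODE are assumptions for illustration, not the paper's actual format.

```python
def operator_prompt(caption, demos, query_condition):
    """Assemble a text prompt for in-context operator learning: a
    natural-language caption carrying human knowledge about the operator,
    demonstration (condition -> solution) pairs, and a final query for the
    model to complete. Field names and layout are illustrative assumptions."""
    lines = [f"Caption: {caption}"]
    lines += [f"Condition: {cond} -> Solution: {sol}" for cond, sol in demos]
    lines.append(f"Condition: {query_condition} -> Solution:")
    return "\n".join(lines)

print(operator_prompt(
    caption="u'(t) = c * u(t) with an unknown decay rate c",
    demos=[("u(0)=1.0, t=1.0", "u(1)=0.61"),
           ("u(0)=2.0, t=1.0", "u(1)=1.22")],
    query_condition="u(0)=4.0, t=1.0",
))
```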
arXiv Detail & Related papers (2023-08-09T16:44:25Z)
- SINC: Self-Supervised In-Context Learning for Vision-Language Tasks [64.44336003123102]
We propose a framework to enable in-context learning in large language models.
A meta-model learns from self-supervised prompts consisting of tailored demonstrations.
Experiments show that SINC outperforms gradient-based methods in various vision-language tasks.
arXiv Detail & Related papers (2023-07-15T08:33:08Z)
- The Learnability of In-Context Learning [16.182561312622315]
We propose a first-of-its-kind PAC-based framework for in-context learnability.
Our framework includes an initial pretraining phase, which fits a function to the pretraining distribution.
We show that in-context learning is more about identifying the task than about learning it.
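The task-identification view can be summarized in a standard Bayesian-mixture form; this is an illustrative formalization, not necessarily the paper's exact framework. Given demonstrations D = {(x_i, y_i)} and a query x, the pretrained model marginalizes over latent tasks, and more demonstrations mainly sharpen the posterior over which task is active:

```latex
% Predictive distribution as a mixture over latent tasks \tau, given
% demonstrations D = \{(x_i, y_i)\}_{i=1}^n and a query point x:
p(y \mid x, D) = \sum_{\tau} p(y \mid x, \tau)\, p(\tau \mid D),
\qquad
p(\tau \mid D) \propto p(\tau) \prod_{i=1}^{n} p(y_i \mid x_i, \tau)
% As n grows, p(\tau | D) concentrates on the task that generated D:
% the demonstrations identify the task rather than fit new parameters.
```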
arXiv Detail & Related papers (2023-03-14T13:28:39Z)
- In-context Learning Distillation: Transferring Few-shot Learning Ability of Pre-trained Language Models [55.78264509270503]
We introduce in-context learning distillation to transfer in-context few-shot learning ability from large models to smaller models.
We perform in-context learning distillation under two different few-shot learning paradigms: Meta In-context Tuning (Meta-ICT) and Multitask In-context Tuning (Multitask-ICT).
Our experiments and analysis reveal that in-context learning objectives and language modeling objectives are complementary under the Multitask-ICT paradigm.
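A minimal sketch of how the two complementary objectives might be combined into one student loss: a soft-label distillation term toward the teacher's in-context predictions plus a standard next-token language-modeling term. The weighting and temperature scheme are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, lm_logits, lm_targets,
                      alpha=0.5, temperature=2.0):
    """Illustrative combined objective for in-context learning distillation
    (assumed weighting, not the paper's exact loss)."""
    t = temperature
    # In-context learning objective: match the teacher's softened distribution.
    icl = F.kl_div(F.log_softmax(student_logits / t, dim=-1),
                   F.softmax(teacher_logits / t, dim=-1),
                   reduction="batchmean") * (t * t)
    # Language-modeling objective: cross-entropy on next-token targets.
    lm = F.cross_entropy(lm_logits.view(-1, lm_logits.size(-1)),
                         lm_targets.view(-1))
    return alpha * icl + (1 - alpha) * lm

# Toy shapes: (batch, sequence, vocab) logits and integer targets.
logits = torch.randn(4, 10, 100)
targets = torch.randint(0, 100, (4, 10))
print(distillation_loss(logits, torch.randn(4, 10, 100), logits, targets))
```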
arXiv Detail & Related papers (2022-12-20T22:11:35Z)
- Importance Weighted Policy Learning and Adaptation [89.46467771037054]
We study a complementary approach which is conceptually simple, general, modular and built on top of recent improvements in off-policy learning.
The framework is inspired by ideas from the probabilistic inference literature and combines robust off-policy learning with a behavior prior.
Our approach achieves competitive adaptation performance on hold-out tasks compared to meta reinforcement learning baselines and can scale to complex sparse-reward scenarios.
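A simplified sketch of the core computation as described: off-policy advantages reweighted by clipped importance ratios, with a penalty pulling the policy toward a behavior prior. The clipping, coefficient, and estimator details are assumptions, not the paper's exact objective.

```python
import numpy as np

def iw_policy_loss(logp_pi, logp_behavior, logp_prior, advantages,
                   clip=10.0, kl_coef=0.1):
    """Simplified importance-weighted policy objective: off-policy advantages
    reweighted by clipped ratios pi/behavior, plus a log-ratio penalty that
    keeps the policy close to a behavior prior. A sketch under assumptions,
    not the paper's exact estimator."""
    ratios = np.minimum(np.exp(logp_pi - logp_behavior), clip)
    policy_term = -np.mean(ratios * advantages * logp_pi)
    prior_term = kl_coef * np.mean(logp_pi - logp_prior)  # ~ KL(pi || prior)
    return policy_term + prior_term

rng = np.random.default_rng(0)
logp = rng.normal(-1.0, 0.3, size=256)
print(iw_policy_loss(logp, logp + rng.normal(0, 0.2, 256),
                     logp - 0.1, rng.normal(0, 1, 256)))
```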
arXiv Detail & Related papers (2020-09-10T14:16:58Z)
- Revisiting Meta-Learning as Supervised Learning [69.2067288158133]
We aim to provide a principled, unifying framework by revisiting and strengthening the connection between meta-learning and traditional supervised learning.
By treating pairs of task-specific data sets and target models as (feature, label) samples, we can reduce many meta-learning algorithms to instances of supervised learning.
This view not only unifies meta-learning into an intuitive and practical framework but also allows us to transfer insights from supervised learning directly to improve meta-learning.
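A hedged sketch of this reduction: each whole task dataset becomes one feature vector and its fitted target model becomes the label, after which any ordinary supervised regressor acts as a meta-learner. The least-squares target model and flattened encoding below are illustrative assumptions.

```python
import numpy as np

def tasks_to_supervised_dataset(tasks):
    """Reduce meta-learning to supervised learning: treat each
    (task dataset, target model) pair as one (feature, label) sample.
    Here a task is a list of (x, y) points and the 'target model' is the
    least-squares line fit to them -- an illustrative stand-in for
    whatever per-task model the meta-learner should output."""
    features, labels = [], []
    for pts in tasks:
        xs = np.array([x for x, _ in pts])
        ys = np.array([y for _, y in pts])
        slope, intercept = np.polyfit(xs, ys, deg=1)  # per-task target model
        features.append(np.stack([xs, ys], axis=1).ravel())  # flattened dataset
        labels.append(np.array([slope, intercept]))
    return np.stack(features), np.stack(labels)

# Any ordinary regressor trained on (X, Y) is now a meta-learner.
tasks = [[(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
         [(0.0, 0.0), (1.0, -1.0), (2.0, -2.0)]]
X, Y = tasks_to_supervised_dataset(tasks)
print(X.shape, Y.shape)  # (2, 6) (2, 2)
```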
arXiv Detail & Related papers (2020-02-03T06:13:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.