In-Context Example Ordering Guided by Label Distributions
- URL: http://arxiv.org/abs/2402.11447v1
- Date: Sun, 18 Feb 2024 04:08:10 GMT
- Title: In-Context Example Ordering Guided by Label Distributions
- Authors: Zhichao Xu, Daniel Cohen, Bei Wang, Vivek Srikumar
- Abstract summary: We formulate in-context example ordering as an optimization problem.
Inspired by the idea of learning from label proportions, we propose two principles for in-context example ordering guided by the model's probability predictions.
We demonstrate that our approach outperforms the baselines by improving classification accuracy, reducing model miscalibration, and selecting better in-context examples.
- Score: 34.30216341226014
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: By allowing models to predict without task-specific training, in-context
learning (ICL) with pretrained LLMs has enormous potential in NLP. However, a
number of problems persist in ICL. In particular, its performance is sensitive
to the choice and order of in-context examples. Given the same set of
in-context examples with different orderings, model performance may vary
from near random to near state-of-the-art. In this work, we formulate
in-context example ordering as an optimization problem. We examine three
problem settings that differ in the assumptions they make about what is known
about the task. Inspired by the idea of learning from label proportions, we
propose two principles for in-context example ordering guided by the model's
probability predictions. We apply our proposed principles to thirteen text
classification datasets and nine different autoregressive LLMs with 700M to 13B
parameters. We demonstrate that our approach outperforms the baselines by
improving classification accuracy, reducing model miscalibration, and by
selecting better in-context examples.
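The core idea admits a short sketch. Below is a minimal, illustrative implementation of the optimization view, assuming a hypothetical `label_probs_fn` that wraps an LLM and returns a probability distribution over class labels for one prompt; the exhaustive permutation search and the KL-to-target objective are simplifications for illustration, not the paper's exact algorithm.

```python
import itertools
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as aligned lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def build_prompt(ordering, query):
    """Concatenate the demonstrations in the given order, then the query."""
    demos = "\n".join(f"Input: {x}\nLabel: {y}" for x, y in ordering)
    return f"{demos}\nInput: {query}\nLabel:"

def score_ordering(ordering, probe_inputs, label_probs_fn, target_dist):
    """Lower is better: divergence between the model's average predicted
    label distribution on probe inputs and the target label proportion."""
    avg = [0.0] * len(target_dist)
    for query in probe_inputs:
        probs = label_probs_fn(build_prompt(ordering, query))
        avg = [a + p / len(probe_inputs) for a, p in zip(avg, probs)]
    return kl_divergence(avg, target_dist)

def best_ordering(examples, probe_inputs, label_probs_fn, target_dist):
    """Exhaustively search permutations; feasible only for small example sets."""
    return min(
        itertools.permutations(examples),
        key=lambda o: score_ordering(o, probe_inputs, label_probs_fn, target_dist),
    )
```

With a uniform `target_dist`, this prefers orderings under which the model is not biased toward any one label, which is one way to read "guided by label distributions".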
Related papers
- Context-aware Prompt Tuning: Advancing In-Context Learning with Adversarial Methods [69.36397993451742]
This work introduces Context-aware Prompt Tuning (CPT), a method inspired by in-context learning (ICL), prompt tuning (PT), and adversarial attacks.
We modify specific context tokens, considering the unique structure of input and output formats.
Inspired by adversarial attacks, we adjust the input based on the labels present in the context, focusing on minimizing, rather than maximizing, the loss.
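A rough sketch of the loss-minimizing adjustment described above, assuming a Hugging Face-style causal LM that accepts `inputs_embeds` and `labels`; optimizing all context embeddings (rather than specific tokens) and the Adam loop are our simplifications, not CPT's exact procedure.

```python
import torch

def tune_context(model, context_embeds, label_ids, steps=50, lr=1e-3):
    """Gradient-descend on the context token embeddings so the model better
    predicts the labels already present in the context: the loss is
    minimized, the opposite of an adversarial attack's objective.

    label_ids: token ids for the context, with unsupervised positions
    set to -100 (the usual ignore index).
    """
    embeds = context_embeds.detach().clone().requires_grad_(True)
    optimizer = torch.optim.Adam([embeds], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = model(inputs_embeds=embeds.unsqueeze(0),
                     labels=label_ids.unsqueeze(0)).loss
        loss.backward()
        optimizer.step()
    return embeds.detach()
```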
arXiv Detail & Related papers (2024-10-22T17:45:47Z)
- Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve model alignment across different task scenarios.
We implement UAL in a simple fashion: adaptively setting the label-smoothing value during training according to the uncertainty of individual samples.
Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
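A minimal sketch of the adaptive-label-smoothing idea described above; the entropy-based schedule and the `max_smoothing` cap are assumptions for illustration, not the paper's exact recipe.

```python
import math
import torch
import torch.nn.functional as F

def uncertainty_aware_loss(logits, targets, max_smoothing=0.2):
    """Cross-entropy with a per-sample smoothing value that grows with the
    model's predictive entropy: more uncertain samples get softer targets."""
    num_classes = logits.size(-1)
    probs = logits.softmax(dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    # Normalized entropy in [0, 1] scales the smoothing value per sample.
    eps = max_smoothing * entropy / math.log(num_classes)
    one_hot = F.one_hot(targets, num_classes).float()
    soft = (1.0 - eps).unsqueeze(-1) * one_hot + eps.unsqueeze(-1) / num_classes
    return -(soft * logits.log_softmax(dim=-1)).sum(dim=-1).mean()
```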
arXiv Detail & Related papers (2024-06-07T11:37:45Z)
- Addressing Order Sensitivity of In-Context Demonstration Examples in Causal Language Models [18.03259038587496]
In-context learning can be significantly influenced by the order of in-context demonstration examples.
We introduce an unsupervised fine-tuning method, termed the Information-Augmented and Consistency-Enhanced approach.
Our proposed method can reduce the sensitivity of CausalLMs to the order of in-context examples and exhibit robust generalizability.
arXiv Detail & Related papers (2024-02-23T22:39:12Z)
- Mind the instructions: a holistic evaluation of consistency and interactions in prompt-based learning [14.569770617709073]
We present a detailed analysis of which design choices cause instabilities and inconsistencies in task predictions.
We show that spurious correlations between input distributions and labels pose only a minor problem for prompted models.
We statistically analyse the results to show which factors are the most influential, interactive or stable.
arXiv Detail & Related papers (2023-10-20T13:25:24Z)
- Towards Informative Few-Shot Prompt with Maximum Information Gain for In-Context Learning [30.536184852029386]
Large language models (LLMs) can perform in-context learning (ICL) by conditioning on a few demonstrations of a new downstream task.
However, this learning paradigm suffers from high instability stemming from the substantial variance induced by factors such as the input distribution of the selected examples, their ordering, and the prompt format.
arXiv Detail & Related papers (2023-10-13T07:49:11Z)
- Instruction Position Matters in Sequence Generation with Large Language Models [67.87516654892343]
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization.
We propose enhancing the instruction-following capability of LLMs by shifting task instructions to a position after the input sentences.
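A toy illustration of the positional change described above; the template strings are assumptions for demonstration only.

```python
source = "Der Himmel ist blau."
instruction = "Translate the German sentence into English."

pre_prompt = f"{instruction}\n{source}"   # conventional: instruction before the input
post_prompt = f"{source}\n{instruction}"  # proposed: instruction after the input
```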
arXiv Detail & Related papers (2023-08-23T12:36:57Z)
- RetICL: Sequential Retrieval of In-Context Examples with Reinforcement Learning [53.52699766206808]
We propose Retrieval for In-Context Learning (RetICL), a learnable method for modeling and optimally selecting examples sequentially for in-context learning.
We evaluate RetICL on math word problem solving and scientific question answering tasks and show that it consistently outperforms or matches heuristic and learnable baselines.
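A hedged sketch of the sequential-selection framing: each next demonstration is chosen conditioned on the picks so far. The greedy loop and the `score_fn` interface are assumptions; RetICL's actual retriever model and RL training are not shown.

```python
def select_sequentially(candidates, query, score_fn, k=4):
    """Greedily build an ordered list of k in-context examples, where
    score_fn(query, chosen_so_far, candidate) is a learned scoring function."""
    chosen = []
    pool = list(candidates)
    for _ in range(k):
        best = max(pool, key=lambda c: score_fn(query, chosen, c))
        chosen.append(best)
        pool.remove(best)
    return chosen
```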
arXiv Detail & Related papers (2023-05-23T20:15:56Z)
- Active Learning Principles for In-Context Learning with Large Language Models [65.09970281795769]
This paper investigates how Active Learning algorithms can serve as effective demonstration selection methods for in-context learning.
We show that in-context example selection through AL prioritizes high-quality examples that exhibit low uncertainty and bear similarity to the test examples.
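A hedged sketch of the selection criterion this finding suggests: prefer candidates that are similar to the test input but carry low model uncertainty. The cosine/entropy combination and the weight `alpha` are assumptions, not the paper's exact scoring rule.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def entropy(probs):
    """Shannon entropy of a predicted label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def demo_score(test_emb, cand_emb, cand_label_probs, alpha=1.0):
    """Higher is better: similarity to the test input minus an
    uncertainty penalty on the candidate's predicted labels."""
    return cosine(test_emb, cand_emb) - alpha * entropy(cand_label_probs)
```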
arXiv Detail & Related papers (2023-05-23T17:16:04Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Meta-learning via Language Model In-context Tuning [16.306733033119897]
The goal of meta-learning is to learn to adapt to a new task with only a few labeled examples.
We propose in-context tuning, which recasts task adaptation and prediction as a simple sequence prediction problem.
We benchmark our method on two collections of text classification tasks: LAMA and BinaryClfs.
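An illustrative sketch of the sequence-prediction framing: serialize the instruction, a few demonstrations, and the query into one training sequence, and apply the language-modeling loss only to the label tokens. The prompt template and helper name are assumptions.

```python
def build_meta_training_example(instruction, demos, query, label):
    """Serialize one task episode into a single training sequence; during
    fine-tuning, only the final label span would receive loss."""
    demo_text = "\n".join(f"{x} -> {y}" for x, y in demos)
    prompt = f"{instruction}\n{demo_text}\n{query} -> "
    return prompt, label  # loss applies to the `label` tokens only
```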
arXiv Detail & Related papers (2021-10-15T02:29:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.