In-Context Probing: Toward Building Robust Classifiers via Probing Large
Language Models
- URL: http://arxiv.org/abs/2305.14171v3
- Date: Fri, 22 Dec 2023 13:27:11 GMT
- Title: In-Context Probing: Toward Building Robust Classifiers via Probing Large
Language Models
- Authors: Afra Amini and Massimiliano Ciaramita
- Abstract summary: In this paper, we propose an alternative approach, which we term In-Context Probing (ICP)
Similar to in-context learning, we contextualize the representation of the input with an instruction, but instead of decoding the output prediction, we probe the contextualized representation to predict the label.
We show that ICP performs competitive or superior to finetuning and can be particularly helpful to build classifiers on top of smaller models.
- Score: 5.5089506884366735
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models are able to learn new tasks in context, where they are
provided with instructions and a few annotated examples. However, the
effectiveness of in-context learning is dependent on the provided context, and
the performance on a downstream task can vary considerably, depending on the
instruction. Importantly, such dependency on the context can surface in
unpredictable ways, e.g., a seemingly more informative instruction might lead
to a worse performance. In this paper, we propose an alternative approach,
which we term In-Context Probing (ICP). Similar to in-context learning, we
contextualize the representation of the input with an instruction, but instead
of decoding the output prediction, we probe the contextualized representation
to predict the label. Through a series of experiments on a diverse set of
classification tasks, we show that in-context probing is significantly more
robust to changes in instructions. We further show that ICP performs
competitive or superior to finetuning and can be particularly helpful to build
classifiers on top of smaller models, with less than a hundred training
examples.
Related papers
- Enhancing Input-Label Mapping in In-Context Learning with Contrastive Decoding [71.01099784480597]
Large language models (LLMs) excel at a range of tasks through in-context learning (ICL)
We introduce In-Context Contrastive Decoding (ICCD), a novel method that emphasizes input-label mapping.
ICCD emphasizes input-label mapping by contrasting the output distributions between positive and negative in-context examples.
arXiv Detail & Related papers (2025-02-19T14:04:46Z) - On the Loss of Context-awareness in General Instruction Fine-tuning [101.03941308894191]
We investigate the loss of context awareness after supervised fine-tuning.
We find that the performance decline is associated with a bias toward different roles learned during conversational instruction fine-tuning.
We propose a metric to identify context-dependent examples from general instruction fine-tuning datasets.
arXiv Detail & Related papers (2024-11-05T00:16:01Z) - Vocabulary-Defined Semantics: Latent Space Clustering for Improving In-Context Learning [32.178931149612644]
In-context learning enables language models to adapt to downstream data or incorporate tasks by few samples as demonstrations within the prompts.
However, the performance of in-context learning can be unstable depending on the quality, format, or order of demonstrations.
We propose a novel approach "vocabulary-defined semantics"
arXiv Detail & Related papers (2024-01-29T14:29:48Z) - Instruction Position Matters in Sequence Generation with Large Language
Models [67.87516654892343]
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization.
We propose enhancing the instruction-following capability of LLMs by shifting the position of task instructions after the input sentences.
arXiv Detail & Related papers (2023-08-23T12:36:57Z) - SINC: Self-Supervised In-Context Learning for Vision-Language Tasks [64.44336003123102]
We propose a framework to enable in-context learning in large language models.
A meta-model can learn on self-supervised prompts consisting of tailored demonstrations.
Experiments show that SINC outperforms gradient-based methods in various vision-language tasks.
arXiv Detail & Related papers (2023-07-15T08:33:08Z) - RetICL: Sequential Retrieval of In-Context Examples with Reinforcement Learning [53.52699766206808]
We propose Retrieval for In-Context Learning (RetICL), a learnable method for modeling and optimally selecting examples sequentially for in-context learning.
We evaluate RetICL on math word problem solving and scientific question answering tasks and show that it consistently outperforms or matches and learnable baselines.
arXiv Detail & Related papers (2023-05-23T20:15:56Z) - Fairness-guided Few-shot Prompting for Large Language Models [93.05624064699965]
In-context learning can suffer from high instability due to variations in training examples, example order, and prompt formats.
We introduce a metric to evaluate the predictive bias of a fixed prompt against labels or a given attributes.
We propose a novel search strategy based on the greedy search to identify the near-optimal prompt for improving the performance of in-context learning.
arXiv Detail & Related papers (2023-03-23T12:28:25Z) - Compositional Exemplars for In-context Learning [21.961094715261133]
Large pretrained language models (LMs) have shown impressive In-Context Learning (ICL) ability.
We propose CEIL (Compositional Exemplars for In-context Learning) to model the interaction between the given input and in-context examples.
We validate CEIL on 12 classification and generation datasets from 7 distinct NLP tasks, including sentiment analysis, paraphrase detection, natural language inference, commonsense reasoning, open-domain question answering, code generation, and semantic parsing.
arXiv Detail & Related papers (2023-02-11T14:02:08Z) - Improving Few-Shot Performance of Language Models via Nearest Neighbor
Calibration [12.334422701057674]
We propose a novel nearest-neighbor calibration framework for in-context learning.
It is inspired by a phenomenon that the in-context learning paradigm produces incorrect labels when inferring training instances.
Experiments on various few-shot text classification tasks demonstrate that our method significantly improves in-context learning.
arXiv Detail & Related papers (2022-12-05T12:49:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.