Large Language Models Can Be Easily Distracted by Irrelevant Context
- URL: http://arxiv.org/abs/2302.00093v3
- Date: Tue, 6 Jun 2023 08:36:20 GMT
- Title: Large Language Models Can Be Easily Distracted by Irrelevant Context
- Authors: Freda Shi, Xinyun Chen, Kanishka Misra, Nathan Scales, David Dohan, Ed
Chi, Nathanael Sch\"arli, Denny Zhou
- Abstract summary: We investigate how the model problem-solving accuracy can be influenced by irrelevant context.
We use benchmark to measure the distractibility of cutting-edge prompting techniques for large language models.
- Score: 29.315230178997002
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models have achieved impressive performance on various natural
language processing tasks. However, so far they have been evaluated primarily
on benchmarks where all information in the input context is relevant for
solving the task. In this work, we investigate the distractibility of large
language models, i.e., how the model problem-solving accuracy can be influenced
by irrelevant context. In particular, we introduce Grade-School Math with
Irrelevant Context (GSM-IC), an arithmetic reasoning dataset with irrelevant
information in the problem description. We use this benchmark to measure the
distractibility of cutting-edge prompting techniques for large language models,
and find that the model performance is dramatically decreased when irrelevant
information is included. We also identify several approaches for mitigating
this deficiency, such as decoding with self-consistency and adding to the
prompt an instruction that tells the language model to ignore the irrelevant
information.
Related papers
- Is Less More? Quality, Quantity and Context in Idiom Processing with Natural Language Models [1.7563410381304443]
We present datasets created by substitution of synonyms of potentially idiomatic English noun compounds in public domain book texts.
Performance on an idiomaticity detection task indicates that dataset quality is a stronger factor for context-enriched models, but that quantity also plays a role in models without context inclusion strategies.
arXiv Detail & Related papers (2024-05-14T10:54:20Z) - Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is the ability to understand instructions written in natural language (prompts)
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z) - Making Retrieval-Augmented Language Models Robust to Irrelevant Context [55.564789967211844]
An important desideratum of RALMs, is that retrieved information helps model performance when it is relevant.
Recent work has shown that retrieval augmentation can sometimes have a negative effect on performance.
arXiv Detail & Related papers (2023-10-02T18:52:35Z) - RAVEN: In-Context Learning with Retrieval-Augmented Encoder-Decoder Language Models [57.12888828853409]
RAVEN is a model that combines retrieval-augmented masked language modeling and prefix language modeling.
Fusion-in-Context Learning enables the model to leverage more in-context examples without requiring additional training.
Our work underscores the potential of retrieval-augmented encoder-decoder language models for in-context learning.
arXiv Detail & Related papers (2023-08-15T17:59:18Z) - Lost in the Middle: How Language Models Use Long Contexts [88.78803442320246]
We analyze the performance of language models on two tasks that require identifying relevant information in their input contexts.
We find that performance can degrade significantly when changing the position of relevant information.
Our analysis provides a better understanding of how language models use their input context and provides new evaluation protocols for future long-context language models.
arXiv Detail & Related papers (2023-07-06T17:54:11Z) - Are Large Language Models Robust Coreference Resolvers? [17.60248310475889]
We show that prompting for coreference can outperform current unsupervised coreference systems.
Further investigations reveal that instruction-tuned LMs generalize surprisingly well across domains, languages, and time periods.
arXiv Detail & Related papers (2023-05-23T19:38:28Z) - Improving Classifier Training Efficiency for Automatic Cyberbullying
Detection with Feature Density [58.64907136562178]
We study the effectiveness of Feature Density (FD) using different linguistically-backed feature preprocessing methods.
We hypothesise that estimating dataset complexity allows for the reduction of the number of required experiments.
The difference in linguistic complexity of datasets allows us to additionally discuss the efficacy of linguistically-backed word preprocessing.
arXiv Detail & Related papers (2021-11-02T15:48:28Z) - Comparison of Interactive Knowledge Base Spelling Correction Models for
Low-Resource Languages [81.90356787324481]
Spelling normalization for low resource languages is a challenging task because the patterns are hard to predict.
This work shows a comparison of a neural model and character language models with varying amounts on target language data.
Our usage scenario is interactive correction with nearly zero amounts of training examples, improving models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z) - CausaLM: Causal Model Explanation Through Counterfactual Language Models [33.29636213961804]
CausaLM is a framework for producing causal model explanations using counterfactual language representation models.
We show that language representation models such as BERT can effectively learn a counterfactual representation for a given concept of interest.
A byproduct of our method is a language representation model that is unaffected by the tested concept.
arXiv Detail & Related papers (2020-05-27T15:06:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.