The Unreliability of Explanations in Few-Shot In-Context Learning
- URL: http://arxiv.org/abs/2205.03401v1
- Date: Fri, 6 May 2022 17:57:58 GMT
- Title: The Unreliability of Explanations in Few-Shot In-Context Learning
- Authors: Xi Ye and Greg Durrett
- Abstract summary: We focus on two NLP tasks that involve reasoning over text, namely question answering and natural language inference.
We show that explanations judged as good by humans--those that are logically consistent with the input--usually indicate more accurate predictions.
We present a framework for calibrating model predictions based on the reliability of the explanations.
- Score: 50.77996380021221
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How can prompting a large language model like GPT-3 with explanations improve
in-context learning? We focus specifically on two NLP tasks that involve
reasoning over text, namely question answering and natural language inference.
Including explanations in the prompt and having the model generate them does
not consistently improve performance in the settings we study, contrary to
recent results on symbolic reasoning tasks (Nye et al., 2021; Wei et al.,
2022). Despite careful prompting, explanations generated by GPT-3 may not even
be factually grounded in the input, even on simple tasks with straightforward
extractive explanations. However, these flawed explanations can still be useful
as a way to verify GPT-3's predictions post-hoc. Through analysis in three
settings, we show that explanations judged as good by humans--those that are
logically consistent with the input and the prediction--usually indicate more
accurate predictions. Following these observations, we present a framework for
calibrating model predictions based on the reliability of the explanations. Our
framework trains calibrators using automatically extracted scores that
approximately assess the reliability of explanations, which helps improve
performance across three different datasets.
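To make the calibration framework above concrete, the following is a minimal, hypothetical sketch rather than the authors' implementation: it scores each generated explanation by its lexical overlap with the input as a rough reliability proxy and trains a logistic-regression calibrator that combines that score with the model's own confidence. The feature choice, function names, and use of scikit-learn are illustrative assumptions.
```python
# Hypothetical sketch (not the authors' code): calibrate few-shot predictions
# with a crude explanation-reliability feature plus the model's confidence.
import numpy as np
from sklearn.linear_model import LogisticRegression


def overlap_score(explanation: str, context: str) -> float:
    """Fraction of explanation tokens that also appear in the input context,
    a rough proxy for whether the explanation is grounded in the input."""
    exp_tokens = set(explanation.lower().split())
    ctx_tokens = set(context.lower().split())
    return len(exp_tokens & ctx_tokens) / len(exp_tokens) if exp_tokens else 0.0


def featurize(examples):
    """Each example is a (context, explanation, model_confidence) triple."""
    return np.array([[overlap_score(e, c), p] for c, e, p in examples])


def train_calibrator(dev_examples, dev_is_correct):
    """Fit a calibrator on a small dev set labeled with whether the
    model's original prediction was correct (1) or wrong (0)."""
    return LogisticRegression().fit(featurize(dev_examples), dev_is_correct)


def calibrated_confidence(calibrator, test_examples):
    """Recalibrated probability that each test prediction is correct."""
    return calibrator.predict_proba(featurize(test_examples))[:, 1]
```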
Related papers
- XForecast: Evaluating Natural Language Explanations for Time Series Forecasting [72.57427992446698]
Time series forecasting aids decision-making, especially for stakeholders who rely on accurate predictions.
Traditional explainable AI (XAI) methods, which underline feature or temporal importance, often require expert knowledge.
Evaluating forecast natural language explanations (NLEs) is difficult due to the complex causal relationships in time series data.
arXiv Detail & Related papers (2024-10-18T05:16:39Z)
- Can Language Models Explain Their Own Classification Behavior? [1.8177391253202122]
Large language models (LLMs) perform well at a myriad of tasks, but explaining the processes behind this performance is a challenge.
This paper investigates whether LLMs can give faithful high-level explanations of their own internal processes.
We release our dataset, ArticulateRules, which can be used to test self-explanation for LLMs trained either in-context or by finetuning.
arXiv Detail & Related papers (2024-05-13T02:31:08Z)
- Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate the opacity of Transformer-based similarity models by leveraging improved explanations.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z)
- FLamE: Few-shot Learning from Natural Language Explanations [12.496665033682202]
We present FLamE, a framework for learning from natural language explanations.
Experiments on natural language inference demonstrate effectiveness over strong baselines.
Human evaluation surprisingly reveals that the majority of generated explanations do not adequately justify classification decisions.
arXiv Detail & Related papers (2023-06-13T18:01:46Z)
- Counterfactuals of Counterfactuals: a back-translation-inspired approach to analyse counterfactual editors [3.4253416336476246]
We focus on the analysis of counterfactual, contrastive explanations.
We propose a new back-translation-inspired evaluation methodology.
We show that by iteratively feeding the counterfactual to the explainer we can obtain valuable insights into the behaviour of both the predictor and the explainer models.
arXiv Detail & Related papers (2023-05-26T16:04:28Z)
- Context-faithful Prompting for Large Language Models [51.194410884263135]
Large language models (LLMs) encode parametric knowledge about world facts.
Their reliance on parametric knowledge may cause them to overlook contextual cues, leading to incorrect predictions in context-sensitive NLP tasks.
We assess and enhance LLMs' contextual faithfulness in two aspects: knowledge conflict and prediction with abstention.
arXiv Detail & Related papers (2023-03-20T17:54:58Z)
- Explanation Selection Using Unlabeled Data for Chain-of-Thought Prompting [80.9896041501715]
Explanations that have not been "tuned" for a task, such as off-the-shelf explanations written by nonexperts, may lead to mediocre performance.
This paper tackles the problem of how to optimize explanation-infused prompts in a blackbox fashion.
arXiv Detail & Related papers (2023-02-09T18:02:34Z)
- Reframing Human-AI Collaboration for Generating Free-Text Explanations [46.29832336779188]
We consider the task of generating free-text explanations using a small number of human-written examples.
We find that crowdworkers often prefer explanations generated by GPT-3 to crowdsourced human-written explanations.
We create a pipeline that combines GPT-3 with a supervised filter that incorporates humans-in-the-loop via binary acceptability judgments.
arXiv Detail & Related papers (2021-12-16T07:31:37Z)
- Teach Me to Explain: A Review of Datasets for Explainable NLP [6.256505195819595]
Explainable NLP (ExNLP) has increasingly focused on collecting human-annotated explanations.
These explanations are used downstream in three ways: as data augmentation to improve performance on a predictive task, as a loss signal to train models to produce explanations for their predictions, and as a means to evaluate the quality of model-generated explanations.
In this review, we identify three predominant classes of explanations (highlights, free-text, and structured), organize the literature on annotating each type, point to what has been learned to date, and give recommendations for collecting ExNLP datasets in the future.
arXiv Detail & Related papers (2021-02-24T04:25:01Z)
- Calibrate Before Use: Improving Few-Shot Performance of Language Models [68.17016463756474]
GPT-3 can perform numerous tasks when provided a natural language prompt that contains a few training examples.
We show that this type of few-shot learning can be unstable.
The choice of prompt format, training examples, and even the order of the training examples can cause accuracy to vary from near chance to near state-of-the-art.
arXiv Detail & Related papers (2021-02-19T00:23:59Z)
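For context on the calibration remedy that paper proposes, the following is a minimal sketch of content-free ("contextual") calibration, assuming the model's label probabilities are already available; the two-label example and all names are illustrative.
```python
# Sketch of content-free ("contextual") calibration in the spirit of
# Calibrate Before Use (Zhao et al., 2021); probabilities are assumed given,
# no language-model call is made here.
import numpy as np


def calibrate(label_probs: np.ndarray, content_free_probs: np.ndarray) -> np.ndarray:
    """Rescale label probabilities by the bias the model shows when the test
    input is replaced with a content-free string such as "N/A"."""
    corrected = label_probs / content_free_probs  # apply W = diag(p_cf)^-1 to p
    return corrected / corrected.sum()            # renormalize to a distribution


# Illustrative two-label example: the prompt format biases the model
# toward the first label.
p_cf = np.array([0.7, 0.3])   # probabilities on a content-free input
p = np.array([0.6, 0.4])      # raw probabilities on a real test input
print(calibrate(p, p_cf))     # ~[0.39, 0.61]: the second label now wins
```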
This list is automatically generated from the titles and abstracts of the papers on this site.