FLamE: Few-shot Learning from Natural Language Explanations
- URL: http://arxiv.org/abs/2306.08042v1
- Date: Tue, 13 Jun 2023 18:01:46 GMT
- Title: FLamE: Few-shot Learning from Natural Language Explanations
- Authors: Yangqiaoyu Zhou, Yiming Zhang, and Chenhao Tan
- Abstract summary: We present FLamE, a framework for learning from natural language explanations.
Experiments on natural language inference demonstrate effectiveness over strong baselines.
Human evaluation surprisingly reveals that the majority of generated explanations do not adequately justify classification decisions.
- Score: 12.496665033682202
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Natural language explanations have the potential to provide rich information
that in principle guides model reasoning. Yet, recent work by Lampinen et al.
(2022) has shown limited utility of natural language explanations in improving
classification. To effectively learn from explanations, we present FLamE, a
two-stage few-shot learning framework that first generates explanations using
GPT-3, and then finetunes a smaller model (e.g., RoBERTa) with generated
explanations. Our experiments on natural language inference demonstrate
effectiveness over strong baselines, increasing accuracy by 17.6% over GPT-3
Babbage and 5.7% over GPT-3 Davinci on e-SNLI. Despite improving classification
performance, human evaluation surprisingly reveals that the majority of
generated explanations do not adequately justify classification decisions.
Additional analyses point to the important role of label-specific cues (e.g.,
"not know" for the neutral label) in generated explanations.
Related papers
- Evaluating Human Alignment and Model Faithfulness of LLM Rationale [66.75309523854476]
We study how well large language models (LLMs) explain their generations through rationales.
We show that prompting-based methods are less "faithful" than attribution-based explanations.
arXiv Detail & Related papers (2024-06-28T20:06:30Z)
- Can Language Models Explain Their Own Classification Behavior? [1.8177391253202122]
Large language models (LLMs) perform well at a myriad of tasks, but explaining the processes behind this performance is a challenge.
This paper investigates whether LLMs can give faithful high-level explanations of their own internal processes.
We release our dataset, ArticulateRules, which can be used to test self-explanation for LLMs trained either in-context or by finetuning.
arXiv Detail & Related papers (2024-05-13T02:31:08Z)
- Inference to the Best Explanation in Large Language Models [6.037970847418495]
This paper proposes IBE-Eval, a framework inspired by philosophical accounts of Inference to the Best Explanation (IBE).
IBE-Eval estimates the plausibility of natural language explanations through a combination of explicit logical and linguistic features.
Experiments reveal that IBE-Eval can successfully identify the best explanation with up to 77% accuracy.
arXiv Detail & Related papers (2024-02-16T15:41:23Z)
- Exploring Automatically Perturbed Natural Language Explanations in Relation Extraction [20.02647320786556]
We find that corrupted explanations with diminished inductive biases can achieve competitive or superior performance compared to the original explanations.
Our findings furnish novel insights into the characteristics of natural language explanations.
arXiv Detail & Related papers (2023-05-24T19:17:13Z)
- MaNtLE: Model-agnostic Natural Language Explainer [9.43206883360088]
We introduce MaNtLE, a model-agnostic natural language explainer that analyzes multiple classifier predictions.
MaNtLE uses multi-task training on thousands of synthetic classification tasks to generate faithful explanations.
Simulated user studies indicate that, on average, MaNtLE-generated explanations are at least 11% more faithful than LIME and Anchors explanations.
arXiv Detail & Related papers (2023-05-22T12:58:06Z)
- Zero-Shot Classification by Logical Reasoning on Natural Language Explanations [56.42922904777717]
We propose the framework CLORE (Classification by LOgical Reasoning on Explanations).
CLORE parses explanations into logical structures and then explicitly reasons along these structures on the input to produce a classification score.
We also demonstrate that our framework can be extended to zero-shot classification on visual modality.
arXiv Detail & Related papers (2022-11-07T01:05:11Z)
- Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small [68.879023473838]
We present an explanation for how GPT-2 small performs a natural language task called indirect object identification (IOI).
To our knowledge, this investigation is the largest end-to-end attempt at reverse-engineering a natural behavior "in the wild" in a language model.
arXiv Detail & Related papers (2022-11-01T17:08:44Z)
- Explanations from Large Language Models Make Small Reasoners Better [61.991772773700006]
We show that our method can consistently and significantly outperform finetuning baselines across different settings.
As a side benefit, human evaluation shows that our method can generate high-quality explanations to justify its predictions.
arXiv Detail & Related papers (2022-10-13T04:50:02Z)
- The Unreliability of Explanations in Few-Shot In-Context Learning [50.77996380021221]
We focus on two NLP tasks that involve reasoning over text, namely question answering and natural language inference.
We show that explanations judged as good by humans (those that are logically consistent with the input) usually indicate more accurate predictions.
We present a framework for calibrating model predictions based on the reliability of the explanations.
arXiv Detail & Related papers (2022-05-06T17:57:58Z)
- Towards Interpretable Natural Language Understanding with Explanations as Latent Variables [146.83882632854485]
We develop a framework for interpretable natural language understanding that requires only a small set of human annotated explanations for training.
Our framework treats natural language explanations as latent variables that model the underlying reasoning process of a neural model.
arXiv Detail & Related papers (2020-10-24T02:05:56Z)