Do Natural Language Explanations Represent Valid Logical Arguments?
Verifying Entailment in Explainable NLI Gold Standards
- URL: http://arxiv.org/abs/2105.01974v1
- Date: Wed, 5 May 2021 10:59:26 GMT
- Title: Do Natural Language Explanations Represent Valid Logical Arguments?
Verifying Entailment in Explainable NLI Gold Standards
- Authors: Marco Valentino, Ian Pratt-Hartmann, André Freitas
- Abstract summary: An emerging line of research in Explainable NLP is the creation of datasets enriched with human-annotated explanations and rationales.
While human-annotated explanations are used as ground-truth for the inference, there is a lack of systematic assessment of their consistency and rigour.
We propose a systematic annotation methodology, named Explanation Entailment Verification (EEV), to quantify the logical validity of human-annotated explanations.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An emerging line of research in Explainable NLP is the creation of datasets
enriched with human-annotated explanations and rationales, used to build and
evaluate models with step-wise inference and explanation generation
capabilities. While human-annotated explanations are used as ground-truth for
the inference, there is a lack of systematic assessment of their consistency
and rigour. In an attempt to provide a critical quality assessment of
Explanation Gold Standards (XGSs) for NLI, we propose a systematic annotation
methodology, named Explanation Entailment Verification (EEV), to quantify the
logical validity of human-annotated explanations. The application of EEV on
three mainstream datasets reveals the surprising conclusion that a majority of
the explanations, while appearing coherent on the surface, represent logically
invalid arguments, ranging from being incomplete to containing clearly
identifiable logical errors. This conclusion confirms that the inferential
properties of explanations are still poorly formalised and understood, and that
additional work on this line of research is necessary to improve the way
Explanation Gold Standards are constructed.
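
The abstract does not spell out the EEV annotation schema, but its wording (logically valid, incomplete, clearly identifiable logical errors) suggests how per-explanation verdicts could be aggregated into a validity statistic. The following is a minimal sketch under that assumption; the verdict labels, the ExplanationAnnotation container, and the dataset names are illustrative placeholders, not the paper's actual categories.

```python
# Minimal sketch of aggregating EEV-style annotations into a validity statistic.
# The verdict labels and data layout are illustrative assumptions drawn from the
# abstract's wording ("logically valid", "incomplete", "logical errors"), not the
# paper's actual annotation schema.
from collections import Counter
from dataclasses import dataclass
from enum import Enum, auto


class Verdict(Enum):
    VALID = auto()          # premises logically entail the conclusion
    INCOMPLETE = auto()     # a required premise is missing
    LOGICAL_ERROR = auto()  # the argument contains an identifiable logical error


@dataclass
class ExplanationAnnotation:
    dataset: str       # gold standard name, e.g. "dataset_a" (placeholder)
    example_id: str
    verdict: Verdict


def validity_rate(annotations: list[ExplanationAnnotation]) -> dict[str, float]:
    """Fraction of explanations judged logically valid, per dataset."""
    totals: Counter[str] = Counter()
    valid: Counter[str] = Counter()
    for ann in annotations:
        totals[ann.dataset] += 1
        if ann.verdict is Verdict.VALID:
            valid[ann.dataset] += 1
    return {ds: valid[ds] / totals[ds] for ds in totals}


if __name__ == "__main__":
    sample = [
        ExplanationAnnotation("dataset_a", "ex1", Verdict.VALID),
        ExplanationAnnotation("dataset_a", "ex2", Verdict.INCOMPLETE),
        ExplanationAnnotation("dataset_a", "ex3", Verdict.LOGICAL_ERROR),
    ]
    print(validity_rate(sample))  # {'dataset_a': 0.333...}
```

Running the toy example prints a validity rate of one third for the placeholder dataset, the kind of per-dataset proportion that an EEV-style study would report.
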
Related papers
- Reasoning with Natural Language Explanations [15.281385727331473]
Explanation constitutes an archetypal feature of human rationality, underpinning learning and generalisation.
An increasing amount of research in Natural Language Inference (NLI) has started reconsidering the role that explanations play in learning and inference.
arXiv Detail & Related papers (2024-10-05T13:15:24Z)
- Verification and Refinement of Natural Language Explanations through LLM-Symbolic Theorem Proving [13.485604499678262]
This paper investigates the verification and refinement of natural language explanations through the integration of Large Language Models (LLMs) and Theorem Provers (TPs).
We present a neuro-symbolic framework, named Explanation-Refiner, that integrates TPs with LLMs to generate and formalise explanatory sentences.
In turn, the TP is employed to provide formal guarantees on the logical validity of the explanations and to generate feedback for subsequent improvements (a minimal sketch of this kind of verify-and-refine loop follows the list below).
arXiv Detail & Related papers (2024-05-02T15:20:01Z)
- Inference to the Best Explanation in Large Language Models [6.037970847418495]
This paper proposes IBE-Eval, a framework inspired by philosophical accounts of Inference to the Best Explanation (IBE).
IBE-Eval estimates the plausibility of natural language explanations through a combination of explicit logical and linguistic features.
Experiments reveal that IBE-Eval can successfully identify the best explanation with up to 77% accuracy.
arXiv Detail & Related papers (2024-02-16T15:41:23Z)
- Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement [92.61557711360652]
Language models (LMs) often fall short on inductive reasoning, despite achieving impressive success on research benchmarks.
We conduct a systematic study of the inductive reasoning capabilities of LMs through iterative hypothesis refinement.
We reveal several discrepancies between the inductive reasoning processes of LMs and humans, shedding light on both the potentials and limitations of using LMs in inductive reasoning tasks.
arXiv Detail & Related papers (2023-10-12T17:51:10Z)
- MetaLogic: Logical Reasoning Explanations with Fine-Grained Structure [129.8481568648651]
We propose a benchmark to investigate models' logical reasoning capabilities in complex real-life scenarios.
The explanation form is based on multi-hop chains of reasoning and includes three main components.
We evaluate the current best models' performance on this new explanation form.
arXiv Detail & Related papers (2022-10-22T16:01:13Z)
- RES: A Robust Framework for Guiding Visual Explanation [8.835733039270364]
We propose a framework for guiding visual explanation by developing a novel objective that handles inaccurate boundaries, incomplete regions, and inconsistent distributions of human annotations.
Experiments on two real-world image datasets demonstrate the effectiveness of the proposed framework in enhancing both the reasonability of the explanation and the performance of the backbone models.
arXiv Detail & Related papers (2022-06-27T16:06:27Z)
- Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI [60.142926537264714]
We introduce the methodology of Faithfulness-through-Counterfactuals.
It generates a counterfactual hypothesis based on the logical predicates expressed in the explanation.
It then evaluates if the model's prediction on the counterfactual is consistent with that expressed logic.
arXiv Detail & Related papers (2022-05-25T03:40:59Z)
- The Unreliability of Explanations in Few-Shot In-Context Learning [50.77996380021221]
We focus on two NLP tasks that involve reasoning over text, namely question answering and natural language inference.
We show that explanations judged as good by humans, i.e. those that are logically consistent with the input, usually indicate more accurate predictions.
We present a framework for calibrating model predictions based on the reliability of the explanations.
arXiv Detail & Related papers (2022-05-06T17:57:58Z)
- Explainability in Process Outcome Prediction: Guidelines to Obtain Interpretable and Faithful Models [77.34726150561087]
In the field of process outcome prediction, we define explainability through the interpretability of the explanations and the faithfulness of the explainability model.
This paper contributes a set of guidelines named X-MOP which allows selecting the appropriate model based on the event log specifications.
arXiv Detail & Related papers (2022-03-30T05:59:50Z)
- Diagnostics-Guided Explanation Generation [32.97930902104502]
Explanations shed light on a machine learning model's rationales and can aid in identifying deficiencies in its reasoning process.
We show how to optimise for several diagnostic properties when training a model to generate sentence-level explanations.
arXiv Detail & Related papers (2021-09-08T16:27:52Z)
- Local Explanation of Dialogue Response Generation [77.68077106724522]
Local explanation of response generation (LERG) is proposed to gain insights into the reasoning process of a generation model.
LERG views the sequence prediction as uncertainty estimation of a human response and then creates explanations by perturbing the input and calculating the certainty change over the human response.
Our results show that our method consistently improves on other widely used methods by 4.4-12.8% on the proposed automatic and human evaluation metrics for this new task.
arXiv Detail & Related papers (2021-06-11T17:58:36Z)
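
The Explanation-Refiner entry above describes a loop in which an LLM formalises an explanatory sentence, a theorem prover checks its logical validity, and the prover's feedback drives a refinement step. Below is a minimal sketch of such a verify-and-refine loop, assuming hypothetical helper functions (formalise_with_llm, check_with_prover, refine_with_llm); these are stand-ins for illustration only and do not reflect the framework's actual interfaces.

```python
# Minimal sketch of an LLM + theorem-prover verify-and-refine loop, in the spirit
# of the Explanation-Refiner entry above. All three helper functions are
# hypothetical stand-ins; the real framework's interfaces are not described here.
from dataclasses import dataclass


@dataclass
class ProofResult:
    valid: bool
    feedback: str  # prover feedback, e.g. an unprovable step or missing premise


def formalise_with_llm(explanation: str) -> str:
    """Placeholder: ask an LLM to translate the explanation into a formal theory."""
    return f"theorem goal: {explanation}"


def check_with_prover(theory: str) -> ProofResult:
    """Placeholder: hand the formal theory to a theorem prover."""
    return ProofResult(valid=False, feedback="missing premise linking A to B")


def refine_with_llm(explanation: str, feedback: str) -> str:
    """Placeholder: ask an LLM to repair the explanation using prover feedback."""
    return explanation + f" (revised to address: {feedback})"


def verify_and_refine(explanation: str, max_rounds: int = 3) -> tuple[str, bool]:
    """Iteratively formalise, verify, and refine an explanation."""
    for _ in range(max_rounds):
        theory = formalise_with_llm(explanation)
        result = check_with_prover(theory)
        if result.valid:
            return explanation, True
        explanation = refine_with_llm(explanation, result.feedback)
    return explanation, False
```

In the actual framework the prover presumably returns structured proof obligations rather than a plain string, but a string suffices here to illustrate the control flow.
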