Inference to the Best Explanation in Large Language Models
- URL: http://arxiv.org/abs/2402.10767v1
- Date: Fri, 16 Feb 2024 15:41:23 GMT
- Title: Inference to the Best Explanation in Large Language Models
- Authors: Dhairya Dalal, Marco Valentino, André Freitas, and Paul Buitelaar
- Abstract summary: This paper proposes IBE-Eval, a framework inspired by philosophical accounts of Inference to the Best Explanation (IBE).
IBE-Eval estimates the plausibility of natural language explanations through a combination of explicit logical and linguistic features.
Experiments reveal that IBE-Eval can successfully identify the best explanation with up to 77% accuracy.
- Score: 6.037970847418495
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While Large Language Models (LLMs) have found success in real-world
applications, their underlying explanatory process is still poorly understood.
This paper proposes IBE-Eval, a framework inspired by philosophical accounts of
Inference to the Best Explanation (IBE) to advance the interpretation and
evaluation of LLMs' explanations. IBE-Eval estimates the plausibility of
natural language explanations through a combination of explicit logical and
linguistic features including: consistency, parsimony, coherence, and
uncertainty. Extensive experiments are conducted on Causal Question Answering
(CQA), where IBE-Eval is tasked to select the most plausible causal
explanation amongst competing ones generated by LLMs (i.e., GPT 3.5 and Llama
2). The experiments reveal that IBE-Eval can successfully identify the best
explanation with up to 77% accuracy (≈27% above random), improving
upon a GPT 3.5-as-a-Judge baseline (≈ +17%) while being intrinsically
more efficient and interpretable. Additional analyses suggest that, despite
model-specific variances, LLM-generated explanations tend to conform to IBE
criteria and that IBE-Eval is significantly correlated with human judgment,
opening up opportunities for future development of automated explanation
verification tools.
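The abstract describes IBE-Eval as scoring competing LLM-generated explanations on explicit logical and linguistic criteria (consistency, parsimony, coherence, and uncertainty) and selecting the most plausible one. The sketch below is not the paper's implementation: the feature functions are hypothetical stubs and the weighted-sum aggregation is an assumption, intended only to illustrate how such criteria could be combined to rank candidate explanations.

```python
from typing import Callable, Dict, List

# Hypothetical feature functions standing in for the logical/linguistic
# signals named in the abstract. Real implementations would rely on, e.g.,
# logical solvers or syntactic/discourse analysis; these are simple stubs.

def consistency(explanation: str) -> float:
    # Stub: 1.0 would mean no internal contradiction was detected.
    return 1.0

def parsimony(explanation: str) -> float:
    # Prefer shorter explanations: fewer tokens yield a higher score.
    return 1.0 / (1.0 + len(explanation.split()))

def coherence(explanation: str) -> float:
    # Stub for a discourse-coherence estimate in [0, 1].
    return 0.5

def certainty(explanation: str) -> float:
    # Penalise hedging terms as a crude proxy for expressed uncertainty.
    hedges = {"might", "maybe", "possibly", "perhaps"}
    tokens = explanation.lower().split()
    return 1.0 - sum(t in hedges for t in tokens) / max(len(tokens), 1)

FEATURES: Dict[str, Callable[[str], float]] = {
    "consistency": consistency,
    "parsimony": parsimony,
    "coherence": coherence,
    "certainty": certainty,
}

def plausibility(explanation: str, weights: Dict[str, float]) -> float:
    """Weighted combination of the IBE-style criteria (illustrative only)."""
    return sum(weights[name] * fn(explanation) for name, fn in FEATURES.items())

def select_best(candidates: List[str], weights: Dict[str, float]) -> str:
    """Rank competing explanations and return the highest-scoring one."""
    return max(candidates, key=lambda e: plausibility(e, weights))

if __name__ == "__main__":
    candidates = [
        "The window broke because a rock hit it.",
        "Perhaps the window maybe broke because the temperature possibly changed.",
    ]
    weights = {"consistency": 1.0, "parsimony": 1.0, "coherence": 1.0, "certainty": 1.0}
    print(select_best(candidates, weights))  # expected: the first, terser explanation
```

A linear combination keeps the ranking transparent: each criterion's contribution to a candidate's score can be inspected directly, which is in line with the abstract's emphasis on efficiency and interpretability.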
Related papers
- Evaluating the Reliability of Self-Explanations in Large Language Models [2.8894038270224867]
We evaluate two kinds of such self-explanations: extractive and counterfactual.
Our findings reveal that, while these self-explanations can correlate with human judgement, they do not fully and accurately follow the model's decision process.
We show that this gap can be bridged because prompting LLMs for counterfactual explanations can produce faithful, informative, and easy-to-verify results.
arXiv Detail & Related papers (2024-07-19T17:41:08Z)
- RVISA: Reasoning and Verification for Implicit Sentiment Analysis [18.836998294161834]
Implicit sentiment analysis (ISA) poses a significant challenge due to the absence of salient cue words in expressions.
This study proposes RVISA, a two-stage reasoning framework that harnesses the generation ability of DO LLMs and the reasoning ability of ED LLMs to train an enhanced reasoner.
arXiv Detail & Related papers (2024-07-02T15:07:54Z)
- Evaluating Human Alignment and Model Faithfulness of LLM Rationale [66.75309523854476]
We study how well large language models (LLMs) explain their generations through rationales.
We show that prompting-based methods are less "faithful" than attribution-based explanations.
arXiv Detail & Related papers (2024-06-28T20:06:30Z)
- Evaluating Consistency and Reasoning Capabilities of Large Language Models [0.0]
Large Language Models (LLMs) are extensively used today across various sectors, including academia, research, business, and finance.
Despite their widespread adoption, these models often produce incorrect and misleading information, exhibiting a tendency to hallucinate.
This paper aims to evaluate and compare the consistency and reasoning capabilities of both public and proprietary LLMs.
arXiv Detail & Related papers (2024-04-25T10:03:14Z)
- A Hypothesis-Driven Framework for the Analysis of Self-Rationalising Models [0.8702432681310401]
We use a Bayesian network to implement a hypothesis about how a task is solved.
The resulting models do not exhibit a strong similarity to GPT-3.5.
We discuss the implications of this as well as the framework's potential to approximate LLM decisions better in future work.
arXiv Detail & Related papers (2024-02-07T12:26:12Z)
- FaithLM: Towards Faithful Explanations for Large Language Models [67.29893340289779]
Large Language Models (LLMs) have become proficient in addressing complex tasks by leveraging their internal knowledge and reasoning capabilities.
The black-box nature of these models complicates the task of explaining their decision-making processes.
We introduce FaithLM to explain the decisions of LLMs with natural language (NL) explanations.
arXiv Detail & Related papers (2024-02-07T09:09:14Z)
- LogicAsker: Evaluating and Improving the Logical Reasoning Ability of Large Language Models [63.14196038655506]
We introduce LogicAsker, a novel approach for evaluating and enhancing the logical reasoning capabilities of large language models (LLMs).
Our methodology reveals significant gaps in LLMs' learning of logical rules, with identified reasoning failures ranging from 29% to 90% across different models.
We leverage these findings to construct targeted demonstration examples and fine-tune data, notably enhancing logical reasoning in models like GPT-4o by up to 5%.
arXiv Detail & Related papers (2024-01-01T13:53:53Z)
- Explanation-aware Soft Ensemble Empowers Large Language Model In-context Learning [50.00090601424348]
Large language models (LLMs) have shown remarkable capabilities in various natural language understanding tasks.
We propose EASE, an Explanation-Aware Soft Ensemble framework to empower in-context learning with LLMs.
arXiv Detail & Related papers (2023-11-13T06:13:38Z)
- Explanations from Large Language Models Make Small Reasoners Better [61.991772773700006]
We show that our method can consistently and significantly outperform finetuning baselines across different settings.
As a side benefit, human evaluation shows that our method can generate high-quality explanations to justify its predictions.
arXiv Detail & Related papers (2022-10-13T04:50:02Z)
- The Unreliability of Explanations in Few-Shot In-Context Learning [50.77996380021221]
We focus on two NLP tasks that involve reasoning over text, namely question answering and natural language inference.
We show that explanations judged as good by humans (those that are logically consistent with the input) usually indicate more accurate predictions.
We present a framework for calibrating model predictions based on the reliability of the explanations.
arXiv Detail & Related papers (2022-05-06T17:57:58Z)
- Do Natural Language Explanations Represent Valid Logical Arguments? Verifying Entailment in Explainable NLI Gold Standards [0.0]
An emerging line of research in Explainable NLP is the creation of datasets enriched with human-annotated explanations and rationales.
While human-annotated explanations are used as ground-truth for the inference, there is a lack of systematic assessment of their consistency and rigour.
We propose a systematic annotation methodology, named Explanation Entailment Verification (EEV), to quantify the logical validity of human-annotated explanations.
arXiv Detail & Related papers (2021-05-05T10:59:26Z)