Towards Consistent Natural-Language Explanations via
Explanation-Consistency Finetuning
- URL: http://arxiv.org/abs/2401.13986v1
- Date: Thu, 25 Jan 2024 07:04:30 GMT
- Title: Towards Consistent Natural-Language Explanations via
Explanation-Consistency Finetuning
- Authors: Yanda Chen, Chandan Singh, Xiaodong Liu, Simiao Zuo, Bin Yu, He He,
Jianfeng Gao
- Abstract summary: Large language models (LLMs) often generate convincing, fluent explanations.
However, they often generate inconsistent explanations across related inputs.
We propose explanation-consistency finetuning (EC-finetuning) to adapt LLMs to generate consistent natural-language explanations.
- Score: 66.87754065127714
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) often generate convincing, fluent explanations.
However, unlike humans, they often generate inconsistent explanations across
different inputs. For example, an LLM may generate the explanation "all birds
can fly" when answering the question "Can sparrows fly?" yet answer "no" to
the related question "Can penguins fly?". Explanations should be
consistent across related examples so that they allow a human to simulate the
LLM's decision process on multiple examples. We propose explanation-consistency
finetuning (EC-finetuning), a method that adapts LLMs to generate more
consistent natural-language explanations on related examples. EC-finetuning
involves finetuning LLMs on synthetic data that is carefully constructed to
contain consistent explanations. Across a variety of question-answering
datasets in various domains, EC-finetuning yields a 10.0% relative explanation
consistency improvement on four finetuning datasets, and generalizes to seven
out-of-distribution datasets not seen during finetuning (+4.5% relative). Code
is available at https://github.com/yandachen/explanation-consistency-finetuning .
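The abstract describes EC-finetuning only at a high level: construct synthetic data whose explanations stay consistent across related examples, then finetune on it. The sketch below is a minimal illustration of that data-construction step under stated assumptions, not the authors' implementation (the linked repository has that); the helper names build_ec_finetuning_data, propose_related_questions, and answer_with_explanation are hypothetical, and toy stand-ins are used so the sketch runs without any model.

```python
# Minimal sketch of the EC-finetuning idea from the abstract: build synthetic
# finetuning examples in which related questions are paired with explanations
# that remain consistent with each other, then finetune on them.
# All helper names below are assumptions for illustration, not the paper's API.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ECExample:
    question: str
    answer: str
    explanation: str


def build_ec_finetuning_data(
    seed_examples: List[ECExample],
    propose_related_questions: Callable[[str], List[str]],
    answer_with_explanation: Callable[[str, str], ECExample],
) -> List[ECExample]:
    """For each seed example, generate related questions and answer them while
    conditioning on the seed's explanation, so the synthetic explanations stay
    consistent across the related examples (assumed mechanism)."""
    synthetic: List[ECExample] = []
    for seed in seed_examples:
        synthetic.append(seed)
        for related_q in propose_related_questions(seed.question):
            # Conditioning on the seed explanation is what keeps the synthetic
            # data internally consistent in this sketch.
            synthetic.append(answer_with_explanation(related_q, seed.explanation))
    return synthetic


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without a model; a real pipeline would
    # back these callables with LLM calls.
    seed = ECExample(
        "Can sparrows fly?",
        "yes",
        "Most birds can fly, but flightless species like penguins cannot.",
    )
    data = build_ec_finetuning_data(
        [seed],
        propose_related_questions=lambda q: ["Can penguins fly?"],
        answer_with_explanation=lambda q, expl: ECExample(q, "no", expl),
    )
    for ex in data:
        print(ex.question, "->", ex.answer, "|", ex.explanation)
```

In a real pipeline, the two callables would be backed by LLM calls and the resulting question-answer-explanation triples would be formatted into the finetuning dataset; the released code linked above is the authoritative reference.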
Related papers
- From Distributional to Overton Pluralism: Investigating Large Language Model Alignment [82.99849359892112]
We re-examine previously reported reductions in response diversity post-alignment.
Our analysis suggests that an apparent drop in the diversity of responses is largely explained by quality control and information aggregation.
Findings indicate that current alignment techniques capture but do not extend the useful subset of assistant-like base LLM behavior.
arXiv Detail & Related papers (2024-06-25T16:32:33Z)
- Can Language Models Explain Their Own Classification Behavior? [1.8177391253202122]
Large language models (LLMs) perform well at a myriad of tasks, but explaining the processes behind this performance is a challenge.
This paper investigates whether LLMs can give faithful high-level explanations of their own internal processes.
We release our dataset, ArticulateRules, which can be used to test self-explanation for LLMs trained either in-context or by finetuning.
arXiv Detail & Related papers (2024-05-13T02:31:08Z)
- FaithLM: Towards Faithful Explanations for Large Language Models [67.29893340289779]
Large Language Models (LLMs) have become proficient in addressing complex tasks by leveraging their internal knowledge and reasoning capabilities.
The black-box nature of these models complicates the task of explaining their decision-making processes.
We introduce FaithLM to explain the decision of LLMs with natural language (NL) explanations.
arXiv Detail & Related papers (2024-02-07T09:09:14Z)
- Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations [62.61495090463084]
Large language models (LLMs) are trained to imitate humans to explain human decisions.
We evaluate whether an explanation can enable humans to precisely infer the model's outputs on diverse counterfactuals.
We found that LLMs' explanations have low precision and that precision does not correlate with plausibility.
arXiv Detail & Related papers (2023-07-17T17:41:47Z)
- Explanation-based Finetuning Makes Models More Robust to Spurious Cues [21.327036110196637]
Large Language Models (LLMs) are so powerful that they sometimes learn correlations between labels and features that are irrelevant to the task.
We propose explanation-based finetuning as a general approach to mitigate LLMs' reliance on spurious correlations.
We finetune the model to additionally generate a free-text explanation supporting its answer.
arXiv Detail & Related papers (2023-05-08T18:53:45Z)
- Explanation Selection Using Unlabeled Data for Chain-of-Thought Prompting [80.9896041501715]
Explanations that have not been "tuned" for a task, such as off-the-shelf explanations written by nonexperts, may lead to mediocre performance.
This paper tackles the problem of how to optimize explanation-infused prompts in a blackbox fashion.
arXiv Detail & Related papers (2023-02-09T18:02:34Z)
- ExaRanker: Explanation-Augmented Neural Ranker [67.4894325619275]
In this work, we show that neural rankers also benefit from explanations.
We use LLMs such as GPT-3.5 to augment retrieval datasets with explanations.
Our model, dubbed ExaRanker, is finetuned on a few thousand examples with synthetic explanations and performs on par with models finetuned on 3x more examples without explanations.
arXiv Detail & Related papers (2023-01-25T11:03:04Z)
- Improving Neural Model Performance through Natural Language Feedback on Their Explanations [38.96890526935312]
We introduce MERCURIE - an interactive system that refines its explanations for a given reasoning task by getting human feedback in natural language.
Our approach generates graphs that have 40% fewer inconsistencies compared with the off-the-shelf system.
arXiv Detail & Related papers (2021-04-18T08:10:01Z)