Show Me How It's Done: The Role of Explanations in Fine-Tuning Language
Models
- URL: http://arxiv.org/abs/2402.07543v1
- Date: Mon, 12 Feb 2024 10:11:50 GMT
- Title: Show Me How It's Done: The Role of Explanations in Fine-Tuning Language
Models
- Authors: Mohamad Ballout, Ulf Krumnack, Gunther Heidemann and Kai-Uwe
Kuehnberger
- Abstract summary: We show the significant benefits of using fine-tuning with explanations to enhance the performance of language models.
We found that even smaller language models with as few as 60 million parameters benefited substantially from this approach.
- Score: 0.45060992929802207
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Our research demonstrates the significant benefits of using fine-tuning with
explanations to enhance the performance of language models. Unlike prompting,
which leaves the model's parameters unchanged, fine-tuning allows the model to
learn and update its parameters during a training phase. In this study, we
applied fine-tuning to language models of various sizes using data that contained
explanations of the output rather than merely presenting the answers. We found
that even smaller language models with as few as 60 million parameters
benefited substantially from this approach. Interestingly, our results
indicated that the detailed explanations were more beneficial to smaller models
than larger ones, with the latter gaining nearly the same advantage from any
form of explanation, irrespective of its length. Additionally, we demonstrate
that the inclusion of explanations enables the models to solve tasks that they
were not able to solve without explanations. Lastly, we argue that despite the
challenging nature of adding explanations, samples that contain explanations
not only reduce the volume of data required for training but also promote
more effective generalization by the model. In essence, our findings suggest
that fine-tuning with explanations significantly bolsters the performance of
large language models.
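
The setup described above lends itself to a standard sequence-to-sequence fine-tuning workflow. The following is a minimal, illustrative sketch, assuming a Hugging Face `transformers` pipeline with `t5-small` (roughly 60 million parameters, matching the smallest model size mentioned in the abstract); the dataset fields, explanation wording, and hyperparameters are placeholders, not the authors' actual data or configuration.

```python
# Hedged sketch: fine-tune a small seq2seq model on targets that pair an
# explanation with the final answer, rather than the answer alone.
# Toy data and hyperparameters are illustrative assumptions.
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)
from datasets import Dataset

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Toy training pairs: each target carries an explanation before the answer.
examples = [
    {"question": "If a train travels 60 km in 1.5 hours, what is its speed?",
     "target": "Explanation: speed = distance / time = 60 / 1.5 = 40. Answer: 40 km/h."},
    {"question": "What is 12 * 7?",
     "target": "Explanation: 12 * 7 = 12 * 5 + 12 * 2 = 60 + 24 = 84. Answer: 84."},
]

def preprocess(batch):
    # Tokenize questions as inputs and explanation-augmented targets as labels.
    model_inputs = tokenizer(batch["question"], truncation=True, max_length=256)
    labels = tokenizer(text_target=batch["target"], truncation=True, max_length=256)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

train_data = Dataset.from_list(examples).map(
    preprocess, batched=True, remove_columns=["question", "target"])

# Placeholder hyperparameters; the paper's actual settings are not reproduced here.
args = Seq2SeqTrainingArguments(
    output_dir="t5-small-with-explanations",
    per_device_train_batch_size=8,
    num_train_epochs=3,
    learning_rate=3e-4,
    logging_steps=10,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_data,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

The only difference from answer-only fine-tuning in this sketch is the target string: it places the explanation before the final answer, so the same pipeline could serve as the baseline by dropping the "Explanation: ..." prefix.
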
Related papers
- The Effect of Model Size on LLM Post-hoc Explainability via LIME [1.1073658091405039]
This work explores LIME explanations for DeBERTaV3 models of four different sizes on natural language inference tasks.
We evaluate the explanations based on their faithfulness to the models' internal decision processes and their plausibility.
The key finding is that increased model size does not correlate with plausibility despite improved model performance.
arXiv Detail & Related papers (2024-05-08T18:27:20Z) - Learning with Explanation Constraints [91.23736536228485]
We provide a learning theoretic framework to analyze how explanations can improve the learning of our models.
We demonstrate the benefits of our approach over a large array of synthetic and real-world experiments.
arXiv Detail & Related papers (2023-03-25T15:06:47Z) - ExaRanker: Explanation-Augmented Neural Ranker [67.4894325619275]
In this work, we show that neural rankers also benefit from explanations.
We use LLMs such as GPT-3.5 to augment retrieval datasets with explanations.
Our model, dubbed ExaRanker, finetuned on a few thousand examples with synthetic explanations, performs on par with models finetuned on 3x more examples without explanations; an illustrative sketch of this explanation-augmentation step appears after this list.
arXiv Detail & Related papers (2023-01-25T11:03:04Z) - Explanations from Large Language Models Make Small Reasoners Better [61.991772773700006]
We show that our method can consistently and significantly outperform finetuning baselines across different settings.
As a side benefit, human evaluation shows that our method can generate high-quality explanations to justify its predictions.
arXiv Detail & Related papers (2022-10-13T04:50:02Z) - Can language models learn from explanations in context? [21.67788893486215]
Large language models can perform new tasks by adapting to a few in-context examples.
For humans, rapid learning from examples can benefit from explanations that connect examples to task principles.
We investigate whether explanations of few-shot examples can allow language models to adapt more effectively.
arXiv Detail & Related papers (2022-04-05T16:33:44Z) - Interpreting Language Models with Contrastive Explanations [99.7035899290924]
Language models must consider various features to predict a token, such as its part of speech, number, tense, or semantics.
Existing explanation methods conflate evidence for all these features into a single explanation, which is less interpretable for human understanding.
We show that contrastive explanations are quantifiably better than non-contrastive explanations in verifying major grammatical phenomena.
arXiv Detail & Related papers (2022-02-21T18:32:24Z) - When Can Models Learn From Explanations? A Formal Framework for
Understanding the Roles of Explanation Data [84.87772675171412]
We study the circumstances under which explanations of individual data points can improve modeling performance.
We make use of three existing datasets with explanations: e-SNLI, TACRED, SemEval.
arXiv Detail & Related papers (2021-02-03T18:57:08Z) - The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal
Sufficient Subsets [61.66584140190247]
We show that feature-based explanations pose problems even for explaining trivial models.
We show that two popular classes of explainers, Shapley explainers and minimal sufficient subsets explainers, target fundamentally different types of ground-truth explanations.
arXiv Detail & Related papers (2020-09-23T09:45:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.