Are Training Resources Insufficient? Predict First Then Explain!
- URL: http://arxiv.org/abs/2110.02056v1
- Date: Sun, 29 Aug 2021 07:04:50 GMT
- Title: Are Training Resources Insufficient? Predict First Then Explain!
- Authors: Myeongjun Jang and Thomas Lukasiewicz
- Abstract summary: We argue that the predict-then-explain (PtE) architecture is a more efficient approach from a modelling perspective.
We show that the PtE structure is the most data-efficient approach when explanation data are lacking.
- Score: 54.184609286094044
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Natural language free-text explanation generation is an efficient approach to
train explainable language processing models for
commonsense-knowledge-requiring tasks. The predominant form of these
models is the explain-then-predict (EtP) structure, which first generates
explanations and then uses them to make decisions. By the nature of this
structure, the performance of EtP models depends heavily on that of the
explainer. Therefore, a large amount of explanation data is required to train
a good explainer model. However, annotating explanations is expensive.
Moreover, recent work reveals that free-text explanations might not convey
sufficient information for decision making. These facts cast doubt on the
effectiveness of EtP models. In this paper, we argue that the
predict-then-explain (PtE) architecture is a more efficient approach from a
modelling perspective. Our main contribution is twofold. First, we show that
the PtE structure is the most data-efficient approach when explanation data
are lacking. Second, we reveal that the PtE structure is always more
training-efficient than the EtP structure. We also provide experimental
results that confirm these theoretical advantages.
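To make the contrast concrete, the following is a minimal sketch of the two pipelines, assuming a T5-style sequence-to-sequence model and WT5-like prompt formats; the checkpoint, prompts, and NLI task framing are illustrative assumptions rather than the paper's exact setup.

```python
# Sketch of the explain-then-predict (EtP) and predict-then-explain (PtE)
# pipelines with a T5-style seq2seq model. In practice a fine-tuned checkpoint
# would be used; "t5-small" is only a placeholder.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Run a single seq2seq generation step for the given prompt."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

def explain_then_predict(premise: str, hypothesis: str) -> tuple[str, str]:
    # EtP: the explainer runs first and its output conditions the predictor,
    # so prediction quality is bounded by explanation quality.
    explanation = generate(f"explain nli premise: {premise} hypothesis: {hypothesis}")
    label = generate(f"nli premise: {premise} hypothesis: {hypothesis} explanation: {explanation}")
    return label, explanation

def predict_then_explain(premise: str, hypothesis: str) -> tuple[str, str]:
    # PtE: the predictor runs first on the raw input; the explanation is
    # generated afterwards, conditioned on the input and the predicted label.
    label = generate(f"nli premise: {premise} hypothesis: {hypothesis}")
    explanation = generate(f"explain nli premise: {premise} hypothesis: {hypothesis} label: {label}")
    return label, explanation

if __name__ == "__main__":
    print(predict_then_explain("A man is playing a guitar.", "A person is making music."))
```

In the EtP pipeline the predictor consumes the generated explanation, so its accuracy is capped by the explainer, which in turn needs explanation annotations to train; in the PtE pipeline the predictor sees only the raw input, and the explainer becomes a post-hoc component that can be trained on whatever explanation data is available.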
Related papers
- XForecast: Evaluating Natural Language Explanations for Time Series Forecasting [72.57427992446698]
Time series forecasting aids decision-making, especially for stakeholders who rely on accurate predictions.
Traditional explainable AI (XAI) methods, which underline feature or temporal importance, often require expert knowledge.
Evaluating forecast NLEs is difficult due to the complex causal relationships in time series data.
arXiv Detail & Related papers (2024-10-18T05:16:39Z)
- Explainability for Machine Learning Models: From Data Adaptability to User Perception [0.8702432681310401]
This thesis explores the generation of local explanations for already deployed machine learning models.
It aims to identify optimal conditions for producing meaningful explanations considering both data and user requirements.
arXiv Detail & Related papers (2024-02-16T18:44:37Z)
- Show Me How It's Done: The Role of Explanations in Fine-Tuning Language Models [0.45060992929802207]
We show the significant benefits of using fine-tuning with explanations to enhance the performance of language models.
We found that even smaller language models with as few as 60 million parameters benefited substantially from this approach.
arXiv Detail & Related papers (2024-02-12T10:11:50Z)
- Explaining the Model and Feature Dependencies by Decomposition of the Shapley Value [3.0655581300025996]
Shapley values have become one of the go-to methods to explain complex models to end-users.
One downside is that they always require model outputs when some features are missing.
This, however, introduces a non-trivial choice: do we condition on the unknown features or not?
We propose a new algorithmic approach to combine both explanations, removing the burden of choice and enhancing the explanatory power of Shapley values.
arXiv Detail & Related papers (2023-06-19T12:20:23Z)
- Robust Ante-hoc Graph Explainer using Bilevel Optimization [0.7999703756441758]
We propose RAGE, a novel and flexible ante-hoc explainer for graph neural networks.
RAGE can effectively identify molecular substructures that contain the full information needed for prediction.
Our experiments on various molecular classification tasks show that RAGE explanations are better than existing post-hoc and ante-hoc approaches.
arXiv Detail & Related papers (2023-05-25T05:50:38Z)
- BELLA: Black box model Explanations by Local Linear Approximations [10.05944106581306]
We present BELLA, a deterministic model-agnostic post-hoc approach for explaining the individual predictions of regression black-box models.
BELLA provides explanations in the form of a linear model trained in the feature space.
BELLA can produce both factual and counterfactual explanations.
arXiv Detail & Related papers (2023-05-18T21:22:23Z)
- Learning to Scaffold: Optimizing Model Explanations for Teaching [74.25464914078826]
We train models on three natural language processing and computer vision tasks.
We find that students trained with explanations extracted with our framework are able to simulate the teacher significantly more effectively than ones produced with previous methods.
arXiv Detail & Related papers (2022-04-22T16:43:39Z)
- Towards Interpretable Natural Language Understanding with Explanations as Latent Variables [146.83882632854485]
We develop a framework for interpretable natural language understanding that requires only a small set of human annotated explanations for training.
Our framework treats natural language explanations as latent variables that model the underlying reasoning process of a neural model.
arXiv Detail & Related papers (2020-10-24T02:05:56Z)
- Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language? [86.60613602337246]
We introduce a leakage-adjusted simulatability (LAS) metric for evaluating NL explanations.
LAS measures how well explanations help an observer predict a model's output, while controlling for how explanations can directly leak the output (a rough sketch of this idea follows the list below).
We frame explanation generation as a multi-agent game and optimize explanations for simulatability while penalizing label leakage.
arXiv Detail & Related papers (2020-10-08T16:59:07Z)
- The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets [61.66584140190247]
We show that feature-based explanations pose problems even for explaining trivial models.
We show that two popular classes of explainers, Shapley explainers and minimal sufficient subsets explainers, target fundamentally different types of ground-truth explanations.
arXiv Detail & Related papers (2020-09-23T09:45:23Z)
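Relating to the Leakage-Adjusted Simulatability (LAS) entry above, the following is a rough sketch of a leakage-controlled simulatability score as that summary describes it: the explanation-induced accuracy gain of a simulator is macro-averaged over the leaking and non-leaking subsets, so that explanations which merely leak the label cannot inflate the score. The record fields and aggregation are illustrative assumptions; the paper's precise definition may differ.

```python
# Rough sketch of a leakage-adjusted simulatability score. The inputs are
# per-example booleans recording whether a simulator reproduced the model's
# output under three conditions; this structure is an assumption for
# illustration, not the paper's exact protocol.
from dataclasses import dataclass

@dataclass
class Record:
    sim_correct_with_expl: bool   # simulator matches model output given (input, explanation)
    sim_correct_input_only: bool  # simulator matches model output given input alone
    leaked: bool                  # simulator matches model output given explanation alone

def leakage_adjusted_simulatability(records: list[Record]) -> float:
    """Macro-average the explanation-induced accuracy gain over the leaking
    and non-leaking subsets, so label leakage alone cannot inflate the score."""
    def subset_gain(subset: list[Record]) -> float:
        if not subset:
            return 0.0
        gains = [int(r.sim_correct_with_expl) - int(r.sim_correct_input_only) for r in subset]
        return sum(gains) / len(gains)

    leaking = [r for r in records if r.leaked]
    non_leaking = [r for r in records if not r.leaked]
    return 0.5 * (subset_gain(leaking) + subset_gain(non_leaking))

if __name__ == "__main__":
    demo = [
        Record(True, False, True),   # explanation helps, but it also leaks the label
        Record(True, True, False),   # simulator was already correct without the explanation
        Record(False, False, False), # explanation does not help
    ]
    print(leakage_adjusted_simulatability(demo))
```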