Use Perturbations when Learning from Explanations
- URL: http://arxiv.org/abs/2303.06419v3
- Date: Fri, 1 Dec 2023 14:03:15 GMT
- Title: Use Perturbations when Learning from Explanations
- Authors: Juyeon Heo, Vihari Piratla, Matthew Wicker, Adrian Weller
- Abstract summary: Machine learning from explanations (MLX) is an approach to learning that uses human-provided explanations of relevant or irrelevant features for each input.
We recast MLX as a robustness problem, where human explanations specify a lower-dimensional manifold from which perturbations can be drawn.
We consider various approaches to achieving robustness, leading to improved performance over prior MLX methods.
- Score: 51.19736333434313
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning from explanations (MLX) is an approach to learning that uses
human-provided explanations of relevant or irrelevant features for each input
to ensure that model predictions are right for the right reasons. Existing MLX
approaches rely on local model interpretation methods and require strong model
smoothing to align model and human explanations, leading to sub-optimal
performance. We recast MLX as a robustness problem, where human explanations
specify a lower dimensional manifold from which perturbations can be drawn, and
show both theoretically and empirically how this approach alleviates the need
for strong model smoothing. We consider various approaches to achieving
robustness, leading to improved performance over prior MLX methods. Finally, we
show how to combine robustness with an earlier MLX method, yielding
state-of-the-art results on both synthetic and real-world benchmarks.
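To make the recasting concrete, here is a minimal sketch of the idea (our illustration, not the authors' released code): human explanations provide a per-input mask of irrelevant features, perturbations are drawn only along those masked dimensions, and the loss penalizes sensitivity to them. The function name and hyperparameters (`eps`, `n_samples`) are our own placeholders.
```python
# Minimal sketch of MLX-as-robustness (illustrative, not the paper's code).
# `irrelevant_mask` is 1 on features a human marked irrelevant; perturbations
# live on that lower-dimensional manifold of the input space.
import torch
import torch.nn.functional as F

def mlx_robustness_loss(model, x, y, irrelevant_mask, eps=0.5, n_samples=4):
    base_loss = F.cross_entropy(model(x), y)
    perturbed_loss = 0.0
    for _ in range(n_samples):
        # Draw noise restricted to the human-marked irrelevant features.
        delta = eps * torch.randn_like(x) * irrelevant_mask
        perturbed_loss = perturbed_loss + F.cross_entropy(model(x + delta), y)
    # The model must stay correct under irrelevant-feature noise, without
    # requiring strong smoothing of the model everywhere.
    return base_loss + perturbed_loss / n_samples
```
Random sampling is only one choice; the abstract's "various approaches to achieving robustness" suggests alternatives such as adversarial perturbations drawn from the same manifold.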
Related papers
- SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
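As a rough, hypothetical sketch of one ingredient such zero-shot upscaling could use (our assumption from the summary, not SMILE's documented procedure): the difference between a fine-tuned model's weights and the base weights can be compressed into a low-rank "expert" via truncated SVD, so many source models become cheap experts in one MoE.
```python
# Hypothetical sketch (our assumption, not SMILE's exact construction):
# compress a fine-tuned-minus-base weight delta into a low-rank expert.
import torch

def lowrank_expert(w_finetuned: torch.Tensor, w_base: torch.Tensor, rank: int = 8):
    u, s, vh = torch.linalg.svd(w_finetuned - w_base, full_matrices=False)
    a = u[:, :rank] * s[:rank]   # (out_dim, rank)
    b = vh[:rank]                # (rank, in_dim)
    return a, b                  # delta ≈ a @ b, stored cheaply per expert
```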
arXiv Detail & Related papers (2024-08-19T17:32:15Z)
- Exploring the Trade-off Between Model Performance and Explanation Plausibility of Text Classifiers Using Human Rationales [3.242050660144211]
Saliency post-hoc explainability methods are important tools for understanding increasingly complex NLP models.
We present a methodology for incorporating rationales, which are text annotations explaining human decisions, into text classification models.
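One common recipe for such rationale supervision (a sketch under our assumptions, not necessarily this paper's exact objective) is to penalize saliency mass that falls outside the annotated rationale, with a coefficient that sets the performance/plausibility trade-off:
```python
# Sketch of rationale-guided training (our assumed form of the objective).
import torch
import torch.nn.functional as F

def rationale_guided_loss(logits, labels, token_saliency, rationale_mask, lam=0.1):
    # token_saliency: (batch, seq) attribution score per input token
    # rationale_mask: 1 where a human marked the token as evidence, else 0
    task_loss = F.cross_entropy(logits, labels)
    off_rationale = (token_saliency.abs() * (1.0 - rationale_mask)).mean()
    # lam trades task performance against explanation plausibility,
    # the trade-off this paper studies.
    return task_loss + lam * off_rationale
```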
arXiv Detail & Related papers (2024-04-03T22:39:33Z)
- Prototypical Self-Explainable Models Without Re-training [5.837536154627278]
Self-explainable models (SEMs) are trained directly to provide explanations alongside their predictions.
Current SEMs require complex architectures and heavily regularized loss functions, thus necessitating specific and costly training.
We propose a simple yet efficient universal method called KMEx, which can convert any existing pre-trained model into a prototypical SEM.
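A minimal sketch of a KMEx-style conversion, under our reading of the summary: cluster a frozen encoder's embeddings per class into prototypes, then predict by nearest prototype so every prediction ships with an explaining prototype.
```python
# Minimal sketch (our reading, not the official KMEx implementation).
import numpy as np
from sklearn.cluster import KMeans

def build_prototypes(embeddings, labels, k_per_class=5):
    protos, proto_labels = [], []
    for c in np.unique(labels):
        km = KMeans(n_clusters=k_per_class, n_init=10).fit(embeddings[labels == c])
        protos.append(km.cluster_centers_)
        proto_labels.extend([c] * k_per_class)
    return np.vstack(protos), np.array(proto_labels)

def predict_with_prototype(x_emb, protos, proto_labels):
    dists = np.linalg.norm(protos - x_emb, axis=1)
    nearest = int(dists.argmin())
    # The nearest prototype doubles as the explanation for the prediction.
    return proto_labels[nearest], nearest
```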
arXiv Detail & Related papers (2023-12-13T01:15:00Z)
- Faithful Explanations of Black-box NLP Models Using LLM-generated Counterfactuals [67.64770842323966]
Causal explanations of predictions of NLP systems are essential to ensure safety and establish trust.
Existing methods often fall short of explaining model predictions effectively or efficiently.
We propose two approaches for counterfactual (CF) approximation.
arXiv Detail & Related papers (2023-10-01T07:31:04Z)
- Model Agnostic Sample Reweighting for Out-of-Distribution Learning [38.843552982739354]
We propose a principled method, Model Agnostic samPLe rEweighting (MAPLE), to effectively address the OOD problem.
Our key idea is to find an effective reweighting of the training samples so that the standard empirical risk minimization training of a large model leads to superior OOD generalization performance.
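As a hedged illustration of the reweighted objective (the function and `weight_logits` parameterization are ours, not MAPLE's API; MAPLE's contribution is how the weights themselves are optimized, which is omitted here):
```python
# Illustrative weighted empirical risk (sketch only; MAPLE learns the
# weights, e.g., so that the reweighted ERM solution generalizes OOD).
import torch
import torch.nn.functional as F

def weighted_erm_loss(model, x, y, weight_logits):
    per_example = F.cross_entropy(model(x), y, reduction="none")
    weights = torch.softmax(weight_logits, dim=0)  # normalized sample weights
    return (weights * per_example).sum()
```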
arXiv Detail & Related papers (2023-01-24T05:11:03Z)
- When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee in model-based RL (MBRL).
The bounds we derive reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
- HEX: Human-in-the-loop Explainability via Deep Reinforcement Learning [2.322461721824713]
We propose HEX, a human-in-the-loop deep reinforcement learning approach to machine learning explainability (MLX).
Our formulation explicitly considers the decision boundary of the ML model in question, rather than the underlying training data.
Our proposed methods thus synthesize HITL MLX policies that explicitly capture the decision boundary of the model in question for use in limited data scenarios.
arXiv Detail & Related papers (2022-06-02T23:53:40Z)
- Model-Agnostic Multitask Fine-tuning for Few-shot Vision-Language Transfer Learning [59.38343286807997]
We propose Model-Agnostic Multitask Fine-tuning (MAMF) for vision-language models on unseen tasks.
Compared with model-agnostic meta-learning (MAML), MAMF discards the bi-level optimization and uses only first-order gradients.
We show that MAMF consistently outperforms the classical fine-tuning method for few-shot transfer learning on five benchmark datasets.
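A minimal sketch of such a first-order multitask update, under our assumptions about the training loop:
```python
# Sketch of one first-order multitask fine-tuning step (our assumed loop;
# unlike MAML there is no inner-loop adaptation and no second-order term).
import torch

def multitask_step(model, optimizer, task_batches, loss_fn):
    optimizer.zero_grad()
    total = 0.0
    for x, y in task_batches:        # one (inputs, targets) batch per task
        total = total + loss_fn(model(x), y)
    total.backward()                 # plain first-order gradients
    optimizer.step()
    return float(total.detach())
```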
arXiv Detail & Related papers (2022-03-09T17:26:53Z)
- MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories [61.3299263929289]
Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice.
One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio.
We show that this approach can be formulated in terms of mutual information between model parameters and simulated data.
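The standard setup for such an amortized ratio estimator (a sketch of the general recipe; MINIMALIST's specific mutual-information objective differs in detail) trains a classifier to separate dependent (theta, x) pairs from shuffled ones, so its logit estimates log p(x|theta) - log p(x):
```python
# Sketch of amortized likelihood-to-evidence ratio estimation (general
# recipe, not MINIMALIST's exact loss).
import torch
import torch.nn.functional as F

def ratio_estimation_loss(classifier, theta, x):
    joint_logits = classifier(theta, x)                    # dependent pairs
    shuffled = theta[torch.randperm(theta.shape[0])]
    marginal_logits = classifier(shuffled, x)              # independent pairs
    return (F.binary_cross_entropy_with_logits(
                joint_logits, torch.ones_like(joint_logits))
            + F.binary_cross_entropy_with_logits(
                marginal_logits, torch.zeros_like(marginal_logits)))
```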
arXiv Detail & Related papers (2021-06-03T12:59:16Z)
- A Semiparametric Approach to Interpretable Machine Learning [9.87381939016363]
Black box models in machine learning have demonstrated excellent predictive performance in complex problems and high-dimensional settings.
Their lack of transparency and interpretability restricts the applicability of such models in critical decision-making processes.
We propose a novel approach to trading off interpretability and performance in prediction models using ideas from semiparametric statistics.
arXiv Detail & Related papers (2020-06-08T16:38:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.