General Pitfalls of Model-Agnostic Interpretation Methods for Machine
Learning Models
- URL: http://arxiv.org/abs/2007.04131v2
- Date: Tue, 17 Aug 2021 06:58:16 GMT
- Title: General Pitfalls of Model-Agnostic Interpretation Methods for Machine
Learning Models
- Authors: Christoph Molnar, Gunnar König, Julia Herbinger, Timo Freiesleben,
Susanne Dandl, Christian A. Scholbeck, Giuseppe Casalicchio, Moritz
Grosse-Wentrup, Bernd Bischl
- Abstract summary: We highlight many general pitfalls of machine learning model interpretation, such as using interpretation techniques in the wrong context.
We focus on pitfalls for global methods that describe the average model behavior, but many pitfalls also apply to local methods that explain individual predictions.
- Score: 1.025459377812322
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: An increasing number of model-agnostic interpretation techniques for machine
learning (ML) models such as partial dependence plots (PDP), permutation
feature importance (PFI) and Shapley values provide insightful model
interpretations, but can lead to wrong conclusions if applied incorrectly. We
highlight many general pitfalls of ML model interpretation, such as using
interpretation techniques in the wrong context, interpreting models that do not
generalize well, ignoring feature dependencies, interactions, uncertainty
estimates and issues in high-dimensional settings, or making unjustified causal
interpretations, and illustrate them with examples. We focus on pitfalls for
global methods that describe the average model behavior, but many pitfalls also
apply to local methods that explain individual predictions. Our paper addresses
ML practitioners by raising awareness of pitfalls and identifying solutions for
correct model interpretation, but also addresses ML researchers by discussing
open issues for further research.
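Code illustration: the following is a minimal sketch (not the authors' code) of two of the model-agnostic methods the abstract names, permutation feature importance (PFI) and a partial dependence profile, computed with scikit-learn on synthetic data. The dataset, model choice, and all parameter values are illustrative assumptions; the comments point to pitfalls the paper discusses (evaluating on held-out data, reporting uncertainty, extrapolation under dependent features).
```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance, partial_dependence
from sklearn.model_selection import train_test_split

# Synthetic regression data (illustrative assumption, not from the paper).
X, y = make_friedman1(n_samples=500, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)

# Permutation feature importance: performance drop when one feature is shuffled.
# Computing it on held-out data avoids interpreting a model that merely
# memorized the training set (one of the pitfalls raised in the paper).
pfi = permutation_importance(model, X_test, y_test, n_repeats=20, random_state=0)
for j in np.argsort(pfi.importances_mean)[::-1]:
    print(f"feature {j}: PFI = {pfi.importances_mean[j]:.3f} "
          f"+/- {pfi.importances_std[j]:.3f}")  # report uncertainty, not just a point estimate

# Partial dependence of the prediction on feature 0, averaged over the data.
# PDPs average over possibly unrealistic feature combinations, which is the
# extrapolation pitfall the paper warns about when features are dependent.
pd_result = partial_dependence(model, X_test, features=[0], kind="average")
print("PDP values for feature 0:", np.round(pd_result["average"][0], 2))
```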
Related papers
- Influence Functions for Scalable Data Attribution in Diffusion Models [52.92223039302037]
Diffusion models have led to significant advancements in generative modelling.
Yet their widespread adoption poses challenges regarding data attribution and interpretability.
In this paper, we aim to help address such challenges by developing an influence functions framework.
arXiv Detail & Related papers (2024-10-17T17:59:02Z)
- Hard to Explain: On the Computational Hardness of In-Distribution Model Interpretation [0.9558392439655016]
The ability to interpret Machine Learning (ML) models is becoming increasingly essential.
Recent work has demonstrated that it is possible to formally assess interpretability by studying the computational complexity of explaining the decisions of various models.
arXiv Detail & Related papers (2024-08-07T17:20:52Z)
- A Guide to Feature Importance Methods for Scientific Inference [10.31256905045161]
Feature importance (FI) methods provide useful insights into the data-generating process (DGP).
This paper serves as a comprehensive guide to help understand the different interpretations of global FI methods.
arXiv Detail & Related papers (2024-04-19T13:01:59Z)
- Explainability for Large Language Models: A Survey [59.67574757137078]
Large language models (LLMs) have demonstrated impressive capabilities in natural language processing.
This paper introduces a taxonomy of explainability techniques and provides a structured overview of methods for explaining Transformer-based language models.
arXiv Detail & Related papers (2023-09-02T22:14:26Z)
- GAM(e) changer or not? An evaluation of interpretable machine learning models based on additive model constraints [5.783415024516947]
This paper investigates a series of intrinsically interpretable machine learning models.
We evaluate the prediction qualities of five GAMs as compared to six traditional ML models.
arXiv Detail & Related papers (2022-04-19T20:37:31Z)
- Tree-based local explanations of machine learning model predictions, AraucanaXAI [2.9660372210786563]
A tradeoff between performance and intelligibility often has to be faced, especially in high-stakes applications like medicine.
We propose a novel methodological approach for generating explanations of the predictions of a generic ML model.
arXiv Detail & Related papers (2021-10-15T17:39:19Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- Understanding Interpretability by generalized distillation in Supervised Classification [3.5473853445215897]
Recent interpretation strategies focus on human understanding of the underlying decision mechanisms of complex machine learning models.
We propose an interpretation-by-distillation formulation that is defined relative to other ML models.
We evaluate our proposed framework on the MNIST, Fashion-MNIST and Stanford40 datasets.
arXiv Detail & Related papers (2020-12-05T17:42:50Z)
- Deducing neighborhoods of classes from a fitted model [68.8204255655161]
In this article, a new kind of interpretable machine learning method is presented.
It can help to understand the partitioning of the feature space into predicted classes in a classification model using quantile shifts.
Real data points (or specific points of interest) are used, and the changes in the prediction after slightly raising or lowering specific features are observed.
arXiv Detail & Related papers (2020-09-11T16:35:53Z)
- Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z)
- Evaluating the Disentanglement of Deep Generative Models through Manifold Topology [66.06153115971732]
We present a method for quantifying disentanglement that only uses the generative model.
We empirically evaluate several state-of-the-art models across multiple datasets.
arXiv Detail & Related papers (2020-06-05T20:54:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.