Using Decision Tree as Local Interpretable Model in Autoencoder-based LIME
- URL: http://arxiv.org/abs/2204.03321v1
- Date: Thu, 7 Apr 2022 09:39:02 GMT
- Title: Using Decision Tree as Local Interpretable Model in Autoencoder-based LIME
- Authors: Niloofar Ranjbar and Reza Safabakhsh
- Abstract summary: We present a modified version of an autoencoder-based approach for local interpretability called ALIME.
This work proposes a new approach, which uses a decision tree instead of the linear model, as the interpretable model.
Compared to ALIME, the experiments show significantly better stability and local fidelity, as well as improved interpretability.
- Score: 0.76146285961466
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Deep neural networks are used in many domains because of their high accuracy. However, they are considered "black boxes", meaning their decisions are not explainable to humans. In tasks such as medical diagnosis, economic forecasting, and self-driving cars, users want the model to be interpretable so they can decide whether to trust its results. In this work, we present a modified version of ALIME, an autoencoder-based approach for local interpretability. ALIME itself is inspired by the well-known Local Interpretable Model-agnostic Explanations (LIME) method. LIME explains a single instance by generating new data around it and training a local, interpretable linear model on that data. ALIME uses an autoencoder to weight the new data around the sample, but, like LIME, it still uses a linear model as the locally trained interpretable model. This work proposes a new approach that uses a decision tree instead of the linear model as the interpretable surrogate. We evaluate the proposed model in terms of stability, local fidelity, and interpretability on different datasets. Compared to ALIME, the experiments show significantly better stability and local fidelity, and improved interpretability.
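To make the recipe described in the abstract concrete, the following is a minimal, illustrative sketch of a LIME-style explanation that fits a shallow decision tree as the local surrogate. It is not the authors' implementation: ALIME's autoencoder-based sample weighting is approximated here by a simple exponential distance kernel, and all names and parameter values are illustrative.

```python
# Minimal illustrative sketch: explain a black-box model around one
# instance with a shallow decision tree, LIME-style. Not the paper's
# code; ALIME's autoencoder-based weighting is approximated by a simple
# exponential distance kernel, and all names/values are illustrative.
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

def explain_instance(black_box_predict, x, n_samples=5000,
                     perturb_std=0.3, kernel_width=0.75, max_depth=3):
    """Fit a shallow decision tree surrogate around instance x."""
    rng = np.random.default_rng(0)
    # 1. Generate new data around the instance (Gaussian perturbations).
    Z = x + rng.normal(scale=perturb_std, size=(n_samples, x.shape[0]))
    # 2. Query the black box on the perturbed samples.
    y = black_box_predict(Z)
    # 3. Weight samples by proximity to x (ALIME would derive these
    #    weights from an autoencoder's latent-space distances instead).
    dist = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(dist ** 2) / (kernel_width ** 2))
    # 4. Fit the interpretable surrogate: a shallow decision tree.
    surrogate = DecisionTreeRegressor(max_depth=max_depth)
    surrogate.fit(Z, y, sample_weight=weights)
    return surrogate

def toy_black_box(Z):
    # Stand-in for any trained model's predict function.
    return np.sin(Z[:, 0]) + Z[:, 1] ** 2

tree = explain_instance(toy_black_box, x=np.array([0.5, -1.0]))
print(export_text(tree, feature_names=["f0", "f1"]))  # human-readable rules
```

Printing the fitted tree yields human-readable if-then rules; the maximum depth controls the trade-off between local fidelity and rule readability.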
Related papers
- Predicting the Performance of Black-box LLMs through Self-Queries [60.87193950962585]
As large language models (LLMs) are increasingly relied on in AI systems, predicting when they make mistakes is crucial.
In this paper, we extract features of LLMs in a black-box manner by using follow-up prompts and taking the probabilities of different responses as representations.
We demonstrate that training a linear model on these low-dimensional representations produces reliable predictors of model performance at the instance level.
arXiv Detail & Related papers (2025-01-02T22:26:54Z)
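As a rough sketch of this self-query idea (assuming the black box exposes response probabilities for follow-up prompts; the `query_probs` callable and the prompts below are hypothetical), a linear probe can be trained on those probabilities to predict instance-level correctness:

```python
# Illustrative sketch (not the paper's code): use a black-box LLM's
# probabilities for a few follow-up prompts as a low-dimensional feature
# vector, then fit a linear probe predicting whether the original answer
# was correct. `query_probs` is a hypothetical callable standing in for
# whatever model access is available.
import numpy as np
from sklearn.linear_model import LogisticRegression

FOLLOW_UPS = [
    "Are you confident in your previous answer? Answer yes or no.",
    "Could your previous answer be wrong? Answer yes or no.",
]

def extract_features(query_probs, question):
    """query_probs(question, follow_up) -> dict like {'yes': 0.8, 'no': 0.2}."""
    feats = []
    for follow_up in FOLLOW_UPS:
        probs = query_probs(question, follow_up)
        feats.extend([probs["yes"], probs["no"]])
    return feats

def train_performance_predictor(query_probs, questions, is_correct):
    """Fit a linear probe on the follow-up features (instance-level labels)."""
    X = np.array([extract_features(query_probs, q) for q in questions])
    y = np.asarray(is_correct, dtype=int)
    return LogisticRegression(max_iter=1000).fit(X, y)
```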
- MASALA: Model-Agnostic Surrogate Explanations by Locality Adaptation [3.587367153279351]
Existing local Explainable AI (XAI) methods select a region of the input space in the vicinity of a given input instance, for which they approximate the behaviour of a model using a simpler and more interpretable surrogate model.
We propose a novel method, MASALA, for generating explanations, which automatically determines the appropriate local region of impactful model behaviour for each individual instance being explained.
arXiv Detail & Related papers (2024-08-19T15:26:45Z)
- Distilling BlackBox to Interpretable models for Efficient Transfer Learning [19.40897632956169]
Building generalizable AI models is one of the primary challenges in the healthcare domain.
Fine-tuning a model to transfer knowledge from one domain to another requires a significant amount of labeled data in the target domain.
We develop an interpretable model that can be efficiently fine-tuned to an unseen target domain with minimal computational cost.
arXiv Detail & Related papers (2023-05-26T23:23:48Z)
- Interpretability at Scale: Identifying Causal Mechanisms in Alpaca [62.65877150123775]
We use Boundless DAS to efficiently search for interpretable causal structure in large language models while they follow instructions.
Our findings mark a first step toward faithfully understanding the inner workings of our ever-growing and most widely deployed language models.
arXiv Detail & Related papers (2023-05-15T17:15:40Z)
- An Interpretable Loan Credit Evaluation Method Based on Rule Representation Learner [8.08640000394814]
We design an intrinsically interpretable model based on RRL (Rule Representation Learner) for the Lending Club dataset.
During training, we adopt tricks from previous research to effectively train the binary weights.
Our model is used to test the correctness of the explanations generated by the post-hoc method.
arXiv Detail & Related papers (2023-04-03T05:55:04Z)
- Utilizing XAI technique to improve autoencoder based model for computer network anomaly detection with shapley additive explanation (SHAP) [0.0]
Machine learning (ML) and Deep Learning (DL) methods are being adopted rapidly, especially in computer network security.
The lack of transparency of ML and DL based models is a major obstacle to their adoption, and they are criticized for their black-box nature.
XAI is a promising area that can improve the trustworthiness of these models by providing explanations and interpreting their outputs.
arXiv Detail & Related papers (2021-12-14T09:42:04Z)
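A rough sketch of this combination, assuming a small sklearn MLP as the autoencoder and SHAP's KernelExplainer applied to the reconstruction-error score; it illustrates the general setup rather than the paper's exact pipeline:

```python
# Rough sketch of this setup (not the paper's implementation): a small
# sklearn MLP acts as the autoencoder, the anomaly score is the per-sample
# reconstruction error, and SHAP's KernelExplainer attributes that score
# to input features. Data, sizes, and parameters are illustrative.
import numpy as np
import shap
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 6))  # stand-in for "normal" traffic features

# Train a small autoencoder: input -> bottleneck -> reconstruction.
autoencoder = MLPRegressor(hidden_layer_sizes=(8, 2, 8), max_iter=2000)
autoencoder.fit(X_train, X_train)

def anomaly_score(X):
    """Mean reconstruction error; higher values indicate more anomalous samples."""
    recon = autoencoder.predict(X)
    return np.mean((X - recon) ** 2, axis=1)

# Explain which features drive the anomaly score for a few shifted samples.
explainer = shap.KernelExplainer(anomaly_score, X_train[:50])
X_test = rng.normal(size=(3, 6)) + 2.0
shap_values = explainer.shap_values(X_test, nsamples=200)
print(shap_values)  # per-feature contributions to each sample's anomaly score
```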
- Distributionally Robust Recurrent Decoders with Random Network Distillation [93.10261573696788]
We propose a method based on OOD detection with Random Network Distillation to allow an autoregressive language model to disregard OOD context during inference.
We apply our method to a GRU architecture, demonstrating improvements on multiple language modeling (LM) datasets.
arXiv Detail & Related papers (2021-10-25T19:26:29Z)
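For reference, a minimal sketch of the Random Network Distillation mechanism used as an OOD signal, with toy numpy features standing in for the paper's GRU language-model states:

```python
# Minimal sketch of Random Network Distillation (RND) as an OOD signal,
# illustrating the mechanism rather than the paper's GRU setup: a
# predictor is trained to match a frozen, randomly initialised target
# network on in-distribution data; large prediction error on a new input
# indicates out-of-distribution context. Toy numpy features are used.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
d_in, d_out = 16, 8

# Frozen random target network: a fixed random projection.
W_target = rng.normal(size=(d_in, d_out))
def target(X):
    return np.tanh(X @ W_target)

# Train the predictor to imitate the target on in-distribution data only.
X_in = rng.normal(size=(2000, d_in))
predictor = MLPRegressor(hidden_layer_sizes=(64,), max_iter=1000)
predictor.fit(X_in, target(X_in))

def ood_score(X):
    """Prediction error w.r.t. the frozen target; higher means more OOD."""
    return np.mean((predictor.predict(X) - target(X)) ** 2, axis=1)

print(ood_score(rng.normal(size=(5, d_in))))        # in-distribution: low
print(ood_score(rng.normal(size=(5, d_in)) + 4.0))  # shifted inputs: high
```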
- Distilling Interpretable Models into Human-Readable Code [71.11328360614479]
Human-readability is an important and desirable standard for machine-learned model interpretability.
We propose to train interpretable models using conventional methods, and then distill them into concise, human-readable code.
We describe a piecewise-linear curve-fitting algorithm that produces high-quality results efficiently and reliably across a broad range of use cases.
arXiv Detail & Related papers (2021-01-21T01:46:36Z)
- Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z)
- VAE-LIME: Deep Generative Model Based Approach for Local Data-Driven Model Interpretability Applied to the Ironmaking Industry [70.10343492784465]
It is necessary to expose to the process engineer not only the model predictions but also their interpretability.
Model-agnostic local interpretability solutions based on LIME have recently emerged to improve the original method.
We present in this paper a novel approach, VAE-LIME, for local interpretability of data-driven models forecasting the temperature of the hot metal produced by a blast furnace.
arXiv Detail & Related papers (2020-07-15T07:07:07Z)
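To sketch the VAE-LIME idea in simplified form (not the paper's implementation), the local neighbourhood is drawn from a generative model of the data instead of from random perturbations; a Gaussian mixture stands in below for the paper's variational autoencoder, and all names and parameter values are illustrative:

```python
# Simplified sketch of the VAE-LIME idea (not the paper's implementation):
# draw the local neighbourhood from a generative model of the data rather
# than from random perturbations, then fit a weighted linear surrogate.
# A Gaussian mixture stands in for the paper's variational autoencoder;
# all names and parameter values are illustrative.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.linear_model import Ridge

def generative_lime_explain(black_box, X_data, x, n_samples=2000, kernel_width=1.0):
    # 1. Fit a generative model of the data (VAE-LIME uses a VAE here).
    gen = GaussianMixture(n_components=5, random_state=0).fit(X_data)
    # 2. Sample realistic candidates and weight them by proximity to x,
    #    so the neighbourhood stays close to the data manifold.
    Z, _ = gen.sample(n_samples)
    dist = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(dist ** 2) / (kernel_width ** 2))
    # 3. Fit a weighted linear surrogate to the black box locally.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(Z, black_box(Z), sample_weight=weights)
    return surrogate.coef_  # local feature attributions

# Toy usage: explain a nonlinear regressor around one sample.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
def toy_regressor(Z):
    return 2.0 * Z[:, 0] + np.sin(Z[:, 1]) + 0.1 * Z[:, 2] ** 2
print(generative_lime_explain(toy_regressor, X, X[0]))
```

Sampling from a learned model of the data keeps the perturbed neighbourhood closer to realistic operating conditions than purely random perturbations.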
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.