ReLACE: Reinforcement Learning Agent for Counterfactual Explanations of
Arbitrary Predictive Models
- URL: http://arxiv.org/abs/2110.11960v1
- Date: Fri, 22 Oct 2021 17:08:49 GMT
- Title: ReLACE: Reinforcement Learning Agent for Counterfactual Explanations of
Arbitrary Predictive Models
- Authors: Ziheng Chen, Fabrizio Silvestri, Gabriele Tolomei, He Zhu, Jia Wang,
Hongshik Ahn
- Abstract summary: We introduce a model-agnostic algorithm to generate optimal counterfactual explanations.
Our method is easily applied to any black-box model, as the model itself serves as the environment that the DRL agent interacts with.
In addition, we develop an algorithm to extract explainable decision rules from the DRL agent's policy, so as to make the process of generating CFs itself transparent.
- Score: 6.939617874336667
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The demand for explainable machine learning (ML) models has been growing
rapidly in recent years. Amongst the methods proposed to associate ML model
predictions with human-understandable rationale, counterfactual explanations
are one of the most popular. They consist of post-hoc rules derived from
counterfactual examples (CFs), i.e., modified versions of input samples that
result in alternative output responses from the predictive model to be
explained. However, existing CF generation strategies either exploit the
internals of specific models (e.g., random forests or neural networks), or
depend on each sample's neighborhood, which makes them hard to generalize to
more complex models and inefficient on larger datasets. In this work, we
aim to overcome these limitations and introduce a model-agnostic algorithm to
generate optimal counterfactual explanations. Specifically, we formulate the
problem of crafting CFs as a sequential decision-making task and then find the
optimal CFs via deep reinforcement learning (DRL) with discrete-continuous
hybrid action space. Unlike other techniques, our method is easily applied to
any black-box model, since the model itself plays the role of the environment
that the DRL agent interacts with. In addition, we develop an algorithm to extract
explainable decision rules from the DRL agent's policy, so as to make the
process of generating CFs itself transparent. Extensive experiments conducted
on several datasets have shown that our method outperforms existing CF
generation baselines.
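To make the formulation concrete, the following is a minimal sketch (in Python) of how crafting CFs can be cast as a sequential decision-making task with a discrete-continuous hybrid action: pick a feature index (discrete), then assign it a new value (continuous), with the black-box model playing the role of the environment. The names (CFEnv, distill_policy), the reward shaping, and the decision-tree distillation step are illustrative assumptions, not the paper's actual implementation.

import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

class CFEnv:
    """The black-box classifier plays the role of the environment.

    State: the current (perturbed) feature vector.
    Action: a discrete-continuous hybrid pair -- a feature index to change
    (discrete) and the new value to assign to it (continuous).
    """

    def __init__(self, model, x_orig, target_class, max_steps=10):
        self.model = model              # any fitted classifier exposing predict()
        self.x_orig = np.asarray(x_orig, dtype=float)
        self.target = target_class
        self.max_steps = max_steps

    def reset(self):
        self.x = self.x_orig.copy()
        self.steps = 0
        return self.x.copy()

    def step(self, action):
        feature_idx, new_value = action          # hybrid action
        self.x[feature_idx] = new_value
        self.steps += 1
        flipped = self.model.predict(self.x.reshape(1, -1))[0] == self.target
        # Assumed reward shaping: reward validity (the label flip) and
        # penalize distance from the original sample; the paper's exact
        # reward may differ.
        cost = np.abs(self.x - self.x_orig).sum()
        reward = (10.0 if flipped else 0.0) - 0.1 * cost
        done = flipped or self.steps >= self.max_steps
        return self.x.copy(), reward, done, {}

def distill_policy(env, policy, n_episodes=500):
    """Approximate the trained agent's policy with a shallow decision tree,
    so 'which feature gets edited in which states' becomes readable rules."""
    states, choices = [], []
    for _ in range(n_episodes):
        s, done = env.reset(), False
        while not done:
            a = policy(s)       # hypothetical trained policy: s -> (idx, value)
            states.append(s)
            choices.append(a[0])            # distill the discrete component
            s, _, done, _ = env.step(a)
    tree = DecisionTreeClassifier(max_depth=3).fit(states, choices)
    return export_text(tree)

Note that CFEnv touches the black box only through model.predict(), which is why such a formulation stays model-agnostic: a random forest or a neural network can sit behind the environment without exposing its internals, and distilling the learned policy into a shallow tree yields human-readable rules about how the CFs are generated.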
Related papers
- Influence Functions for Scalable Data Attribution in Diffusion Models [52.92223039302037]
Diffusion models have led to significant advancements in generative modelling.
Yet their widespread adoption poses challenges regarding data attribution and interpretability.
In this paper, we aim to help address such challenges by developing an influence functions framework.
arXiv Detail & Related papers (2024-10-17T17:59:02Z)
- Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review [63.31328039424469]
This tutorial provides a comprehensive survey of methods for fine-tuning diffusion models to optimize downstream reward functions.
We explain the application of various RL algorithms, including PPO, differentiable optimization, reward-weighted MLE, value-weighted sampling, and path consistency learning.
arXiv Detail & Related papers (2024-07-18T17:35:32Z)
- Learning Car-Following Behaviors Using Bayesian Matrix Normal Mixture Regression [17.828808886958736]
Car-following (CF) behaviors are crucial for microscopic traffic simulation.
Many data-driven methods, despite their robustness, operate as "black boxes" with limited interpretability.
This work introduces a Bayesian Matrix Normal Mixture Regression (MNMR) model that simultaneously captures feature correlations and temporal dynamics inherent in CF behaviors.
arXiv Detail & Related papers (2024-04-24T17:55:47Z)
- Faithful Explanations of Black-box NLP Models Using LLM-generated Counterfactuals [67.64770842323966]
Causal explanations of predictions of NLP systems are essential to ensure safety and establish trust.
Existing methods often fall short of explaining model predictions effectively or efficiently.
We propose two approaches for counterfactual (CF) approximation.
arXiv Detail & Related papers (2023-10-01T07:31:04Z)
- When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL).
Our follow-up derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically-varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
- MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate its effectiveness, with better validity, sparsity, and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
- DualCF: Efficient Model Extraction Attack from Counterfactual Explanations [57.46134660974256]
Cloud service providers have launched Machine-Learning-as-a-Service platforms to allow users to access large-scale cloud-based models via APIs.
Returning counterfactual explanations alongside predictions inevitably makes the cloud models more vulnerable to extraction attacks.
We propose a simple yet efficient querying strategy that greatly improves the efficiency of stealing a classification model.
arXiv Detail & Related papers (2022-05-13T08:24:43Z)
- CounterNet: End-to-End Training of Prediction Aware Counterfactual Explanations [12.313007847721215]
CounterNet is an end-to-end learning framework which integrates predictive model training and the generation of counterfactual (CF) explanations.
Unlike post-hoc methods, CounterNet optimizes CF explanation generation only once, jointly with the predictive model.
Our experiments on multiple real-world datasets show that CounterNet generates high-quality predictions.
arXiv Detail & Related papers (2021-09-15T20:09:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.