On the Robustness of Counterfactual Explanations to Adverse Perturbations
- URL: http://arxiv.org/abs/2201.09051v1
- Date: Sat, 22 Jan 2022 13:57:45 GMT
- Title: On the Robustness of Counterfactual Explanations to Adverse Perturbations
- Authors: Marco Virgolin and Saverio Fracaros
- Abstract summary: We consider robustness to adverse perturbations, which may naturally happen due to unfortunate circumstances.
We provide two definitions of robustness, which concern, respectively, the features to change and to keep as they are.
Our experiments show that CEs are often not robust and, if adverse perturbations take place, the intervention they prescribe may require a much larger cost than anticipated.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Counterfactual explanations (CEs) are a powerful means for understanding how
decisions made by algorithms can be changed. Researchers have proposed a number
of desiderata that CEs should meet to be practically useful, such as requiring
minimal effort to enact, or complying with causal models. We consider a further
aspect to improve the usability of CEs: robustness to adverse perturbations,
which may naturally happen due to unfortunate circumstances. Since CEs
typically prescribe a sparse form of intervention (i.e., only a subset of the
features should be changed), we provide two definitions of robustness, which
concern, respectively, the features to change and to keep as they are. These
definitions are workable in that they can be incorporated as penalty terms in
the loss functions that are used for discovering CEs. To experiment with the
proposed definitions of robustness, we create and release code where five data
sets (commonly used in the field of fair and explainable machine learning) have
been enriched with feature-specific annotations that can be used to sample
meaningful perturbations. Our experiments show that CEs are often not robust
and, if adverse perturbations take place, the intervention they prescribe may
require a much larger cost than anticipated, or even become impossible.
However, accounting for robustness in the search process, which can be done
rather easily, allows discovering robust CEs systematically. Robust CEs are
resilient to adverse perturbations: the additional intervention needed to
counteract perturbations is much less costly than for non-robust CEs. Our code is
available at: https://github.com/marcovirgolin/robust-counterfactuals
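
To make the abstract's penalty-based formulation concrete, below is a minimal Python sketch of a CE search loss augmented with a robustness penalty estimated from sampled adverse perturbations. It assumes a scikit-learn-style classifier with a `predict` method; the names `sample_perturbation`, `robustness_penalty`, `ce_loss`, `scales`, and `alpha` are illustrative assumptions, not the paper's API (the actual implementation is in the repository linked above).

```python
import numpy as np

def sample_perturbation(x_cf, mask, scales, rng):
    # Perturb only the masked features; `scales` stands in for the
    # feature-specific annotations used to sample meaningful perturbations.
    return x_cf + rng.normal(0.0, scales) * mask

def robustness_penalty(model, x_cf, mask, scales, target,
                       n_samples=32, rng=None):
    # Fraction of sampled adverse perturbations under which the CE no
    # longer reaches the desired class (0.0 means fully robust).
    rng = rng if rng is not None else np.random.default_rng(0)
    failures = sum(
        model.predict(
            sample_perturbation(x_cf, mask, scales, rng).reshape(1, -1)
        )[0] != target
        for _ in range(n_samples)
    )
    return failures / n_samples

def ce_loss(model, x, x_cf, scales, target, alpha=1.0):
    # Validity plus L1 proximity, plus the robustness penalty weighted by
    # alpha. Perturbing the *changed* features corresponds to one of the
    # two robustness definitions; masking the *unchanged* features instead
    # would give the other.
    changed = (x_cf != x).astype(float)
    validity = float(model.predict(x_cf.reshape(1, -1))[0] != target)
    proximity = np.abs(x_cf - x).sum()
    return validity + proximity + alpha * robustness_penalty(
        model, x_cf, changed, scales, target)
```

In a gradient-free CE search (e.g., an evolutionary algorithm), `ce_loss` would simply replace the unpenalized objective when ranking candidate counterfactuals.
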
Related papers
- Refining Counterfactual Explanations With Joint-Distribution-Informed Shapley Towards Actionable Minimality [6.770853093478073]
Counterfactual explanations (CE) identify data points that closely resemble the observed data but produce different machine learning (ML) model outputs.
Existing CE methods often lack actionable efficiency because of unnecessary feature changes included within the explanations.
We propose a method that minimizes the required feature changes while maintaining the validity of CE.
arXiv Detail & Related papers (2024-10-07T18:31:19Z)
- Automatically Adaptive Conformal Risk Control [49.95190019041905]
We propose a methodology for achieving approximate conditional control of statistical risks by adapting to the difficulty of test samples.
Our framework goes beyond traditional conditional risk control based on user-provided conditioning events to the algorithmic, data-driven determination of appropriate function classes for conditioning.
arXiv Detail & Related papers (2024-06-25T08:29:32Z)
- Verified Training for Counterfactual Explanation Robustness under Data Shift [18.156341188646348]
Counterfactual explanations (CEs) enhance the interpretability of machine learning models by describing what changes to an input are necessary to change its prediction to a desired class.
Existing approaches generate CEs by focusing on a single, fixed model, and do not provide any formal guarantees on the CEs' future validity.
This paper introduces VeriTraCER, an approach that jointly trains a classifier and an explainer to explicitly consider the robustness of the generated CEs to small model shifts.
arXiv Detail & Related papers (2024-03-06T15:06:16Z)
- Introducing User Feedback-based Counterfactual Explanations (UFCE) [49.1574468325115]
Counterfactual explanations (CEs) have emerged as a viable solution for generating comprehensible explanations in XAI.
UFCE allows for the inclusion of user constraints to determine the smallest modifications in the subset of actionable features.
UFCE outperforms two well-known CE methods in terms of proximity, sparsity, and feasibility.
arXiv Detail & Related papers (2024-02-26T20:09:44Z)
- Provably Robust and Plausible Counterfactual Explanations for Neural Networks via Robust Optimisation [19.065904250532995]
We propose Provably RObust and PLAusible Counterfactual Explanations (PROPLACE).
We formulate an iterative algorithm to compute provably robust CEs and prove its convergence, soundness and completeness.
We show that PROPLACE achieves state-of-the-art performance on metrics spanning three evaluation aspects.
arXiv Detail & Related papers (2023-09-22T00:12:09Z)
- Flexible and Robust Counterfactual Explanations with Minimal Satisfiable Perturbations [56.941276017696076]
We propose a conceptually simple yet effective solution named Counterfactual Explanations with Minimal Satisfiable Perturbations (CEMSP).
CEMSP constrains changing values of abnormal features with the help of their semantically meaningful normal ranges.
We conduct comprehensive experiments on both synthetic and real-world datasets to demonstrate that our method provides more robust explanations than existing methods while preserving flexibility.
arXiv Detail & Related papers (2023-09-09T04:05:56Z)
- Calibrated Explanations: with Uncertainty Information and Counterfactuals [0.1843404256219181]
Calibrated Explanations (CE) is built on the foundation of Venn-Abers predictors.
It provides uncertainty quantification for both feature weights and the model's probability estimates.
Results from an evaluation with 25 benchmark datasets underscore the efficacy of CE.
arXiv Detail & Related papers (2023-05-03T17:52:41Z)
- Generating robust counterfactual explanations [60.32214822437734]
The quality of a counterfactual depends on several criteria: realism, actionability, validity, robustness, etc.
In this paper, we are interested in the notion of robustness of a counterfactual. More precisely, we focus on robustness to counterfactual input changes.
We propose a new framework, CROCO, that generates robust counterfactuals while effectively managing the trade-off between robustness and proximity, and guarantees the user a minimum level of robustness.
arXiv Detail & Related papers (2023-04-24T09:00:31Z)
- Robustness and Accuracy Could Be Reconcilable by (Proper) Definition [109.62614226793833]
The trade-off between robustness and accuracy has been widely studied in the adversarial literature.
We find that it may stem from the improperly defined robust error, which imposes an inductive bias of local invariance.
We propose a self-consistent robust error (SCORE) that, by definition, facilitates the reconciliation between robustness and accuracy, while still handling the worst-case uncertainty.
arXiv Detail & Related papers (2022-02-21T10:36:09Z)
- Provable tradeoffs in adversarially robust classification [96.48180210364893]
We develop and leverage new tools, including recent breakthroughs from probability theory on robust isoperimetry.
Our results reveal fundamental tradeoffs between standard and robust accuracy that grow when data is imbalanced.
arXiv Detail & Related papers (2020-06-09T09:58:19Z)