RobustX: Robust Counterfactual Explanations Made Easy
- URL: http://arxiv.org/abs/2502.13751v1
- Date: Wed, 19 Feb 2025 14:12:01 GMT
- Title: RobustX: Robust Counterfactual Explanations Made Easy
- Authors: Junqi Jiang, Luca Marzari, Aaryan Purohit, Francesco Leofante,
- Abstract summary: This paper introduces RobustX, an open-source Python library implementing a collection of CE generation and evaluation methods.
It provides interfaces to several existing methods, enabling streamlined access to state-of-the-art techniques.
- Score: 4.875355171029671
- License:
- Abstract: The increasing use of Machine Learning (ML) models to aid decision-making in high-stakes industries demands explainability to facilitate trust. Counterfactual Explanations (CEs) are ideally suited for this, as they can offer insights into the predictions of an ML model by illustrating how changes in its input data may lead to different outcomes. However, for CEs to realise their explanatory potential, significant challenges remain in ensuring their robustness under slight changes in the scenario being explained. Despite the widespread recognition of CEs' robustness as a fundamental requirement, a lack of standardised tools and benchmarks hinders a comprehensive and effective comparison of robust CE generation methods. In this paper, we introduce RobustX, an open-source Python library implementing a collection of CE generation and evaluation methods, with a focus on the robustness property. RobustX provides interfaces to several existing methods from the literature, enabling streamlined access to state-of-the-art techniques. The library is also easily extensible, allowing fast prototyping of novel robust CE generation and evaluation methods.
Related papers
- Rigorous Probabilistic Guarantees for Robust Counterfactual Explanations [80.86128012438834]
We show for the first time that computing the robustness of counterfactuals with respect to plausible model shifts is NP-complete.
We propose a novel probabilistic approach which is able to provide tight estimates of robustness with strong guarantees.
arXiv Detail & Related papers (2024-07-10T09:13:11Z) - Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
arXiv Detail & Related papers (2024-06-05T16:35:30Z) - CELA: Cost-Efficient Language Model Alignment for CTR Prediction [70.65910069412944]
Click-Through Rate (CTR) prediction holds a paramount position in recommender systems.
Recent efforts have sought to mitigate these challenges by integrating Pre-trained Language Models (PLMs)
We propose textbfCost-textbfEfficient textbfLanguage Model textbfAlignment (textbfCELA) for CTR prediction.
arXiv Detail & Related papers (2024-05-17T07:43:25Z) - Interval Abstractions for Robust Counterfactual Explanations [15.954944873701503]
Counterfactual Explanations (CEs) have emerged as a major paradigm in explainable AI research.
Existing methods often become invalid when slight changes occur in the parameters of the model they were generated for.
We propose a novel interval abstraction technique for machine learning models, which allows us to obtain provable robustness guarantees.
arXiv Detail & Related papers (2024-04-21T18:24:34Z) - Verified Training for Counterfactual Explanation Robustness under Data
Shift [18.156341188646348]
Counterfactual explanations (CEs) enhance the interpretability of machine learning models by describing what changes to an input are necessary to change its prediction to a desired class.
Existing approaches generate CEs by focusing on a single, fixed model, and do not provide any formal guarantees on the CEs' future validity.
This paper introduces VeriTraCER, an approach that jointly trains a classifier and an explainer to explicitly consider the robustness of the generated CEs to small model shifts.
arXiv Detail & Related papers (2024-03-06T15:06:16Z) - Introducing User Feedback-based Counterfactual Explanations (UFCE) [49.1574468325115]
Counterfactual explanations (CEs) have emerged as a viable solution for generating comprehensible explanations in XAI.
UFCE allows for the inclusion of user constraints to determine the smallest modifications in the subset of actionable features.
UFCE outperforms two well-known CE methods in terms of textitproximity, textitsparsity, and textitfeasibility.
arXiv Detail & Related papers (2024-02-26T20:09:44Z) - Provably Robust and Plausible Counterfactual Explanations for Neural Networks via Robust Optimisation [19.065904250532995]
We propose Provably RObust and PLAusible Counterfactual Explanations (PROPLACE)
We formulate an iterative algorithm to compute provably robust CEs and prove its convergence, soundness and completeness.
We show that PROPLACE achieves state-of-the-art performances against metrics on three evaluation aspects.
arXiv Detail & Related papers (2023-09-22T00:12:09Z) - Diffusion-based Visual Counterfactual Explanations -- Towards Systematic
Quantitative Evaluation [64.0476282000118]
Latest methods for visual counterfactual explanations (VCE) harness the power of deep generative models to synthesize new examples of high-dimensional images of impressive quality.
It is currently difficult to compare the performance of these VCE methods as the evaluation procedures largely vary and often boil down to visual inspection of individual examples and small scale user studies.
We propose a framework for systematic, quantitative evaluation of the VCE methods and a minimal set of metrics to be used.
arXiv Detail & Related papers (2023-08-11T12:22:37Z) - REX: Rapid Exploration and eXploitation for AI Agents [103.68453326880456]
We propose an enhanced approach for Rapid Exploration and eXploitation for AI Agents called REX.
REX introduces an additional layer of rewards and integrates concepts similar to Upper Confidence Bound (UCB) scores, leading to more robust and efficient AI agent performance.
arXiv Detail & Related papers (2023-07-18T04:26:33Z) - Generating Interpretable Counterfactual Explanations By Implicit
Minimisation of Epistemic and Aleatoric Uncertainties [24.410724285492485]
We introduce a simple and fast method for generating interpretable CEs in a white-box setting without an auxiliary model.
Our experiments show that our proposed algorithm generates more interpretable CEs, according to IM1 scores, than existing methods.
arXiv Detail & Related papers (2021-03-16T10:20:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.