Related papers: Interval Abstractions for Robust Counterfactual Explanations

Interval Abstractions for Robust Counterfactual Explanations

URL: http://arxiv.org/abs/2404.13736v1
Date: Sun, 21 Apr 2024 18:24:34 GMT
Title: Interval Abstractions for Robust Counterfactual Explanations
Authors: Junqi Jiang, Francesco Leofante, Antonio Rago, Francesca Toni,
Abstract summary: We propose a novel interval abstraction technique for parametric machine learning models. We obtain provable robustness guarantees of CEs under the possibly infinite set of plausible model changes $Delta$.
Score: 15.954944873701503
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Counterfactual Explanations (CEs) have emerged as a major paradigm in explainable AI research, providing recourse recommendations for users affected by the decisions of machine learning models. However, when slight changes occur in the parameters of the underlying model, CEs found by existing methods often become invalid for the updated models. The literature lacks a way to certify deterministic robustness guarantees for CEs under model changes, in that existing methods to improve CEs' robustness are heuristic, and the robustness performances are evaluated empirically using only a limited number of retrained models. To bridge this gap, we propose a novel interval abstraction technique for parametric machine learning models, which allows us to obtain provable robustness guarantees of CEs under the possibly infinite set of plausible model changes $\Delta$. We formalise our robustness notion as the $\Delta$-robustness for CEs, in both binary and multi-class classification settings. We formulate procedures to verify $\Delta$-robustness based on Mixed Integer Linear Programming, using which we further propose two algorithms to generate CEs that are $\Delta$-robust. In an extensive empirical study, we demonstrate how our approach can be used in practice by discussing two strategies for determining the appropriate hyperparameter in our method, and we quantitatively benchmark the CEs generated by eleven methods, highlighting the effectiveness of our algorithms in finding robust CEs.

Related papers

EKPC: Elastic Knowledge Preservation and Compensation for Class-Incremental Learning [53.88000987041739]
Class-Incremental Learning (CIL) aims to enable AI models to continuously learn from sequentially arriving data of different classes over time.<n>We propose the Elastic Knowledge Preservation and Compensation (EKPC) method, integrating Importance-aware importance Regularization (IPR) and Trainable Semantic Drift Compensation (TSDC) for CIL.
arXiv Detail & Related papers (2025-06-14T05:19:58Z)
RobustX: Robust Counterfactual Explanations Made Easy [4.875355171029671]
This paper introduces RobustX, an open-source Python library implementing a collection of CE generation and evaluation methods. It provides interfaces to several existing methods, enabling streamlined access to state-of-the-art techniques.
arXiv Detail & Related papers (2025-02-19T14:12:01Z)
Nested Annealed Training Scheme for Generative Adversarial Networks [54.70743279423088]
This paper focuses on a rigorous mathematical theoretical framework: the composite-functional-gradient GAN (CFG) We reveal the theoretical connection between the CFG model and score-based models. We find that the training objective of the CFG discriminator is equivalent to finding an optimal D(x)
arXiv Detail & Related papers (2025-01-20T07:44:09Z)
Rigorous Probabilistic Guarantees for Robust Counterfactual Explanations [80.86128012438834]
We show for the first time that computing the robustness of counterfactuals with respect to plausible model shifts is NP-complete. We propose a novel probabilistic approach which is able to provide tight estimates of robustness with strong guarantees.
arXiv Detail & Related papers (2024-07-10T09:13:11Z)
DistiLLM: Towards Streamlined Distillation for Large Language Models [53.46759297929675]
DistiLLM is a more effective and efficient KD framework for auto-regressive language models. DisiLLM comprises two components: (1) a novel skew Kullback-Leibler divergence loss, where we unveil and leverage its theoretical properties, and (2) an adaptive off-policy approach designed to enhance the efficiency in utilizing student-generated outputs.
arXiv Detail & Related papers (2024-02-06T11:10:35Z)
Provably Robust and Plausible Counterfactual Explanations for Neural Networks via Robust Optimisation [19.065904250532995]
We propose Provably RObust and PLAusible Counterfactual Explanations (PROPLACE) We formulate an iterative algorithm to compute provably robust CEs and prove its convergence, soundness and completeness. We show that PROPLACE achieves state-of-the-art performances against metrics on three evaluation aspects.
arXiv Detail & Related papers (2023-09-22T00:12:09Z)
Counterfactual Explanation via Search in Gaussian Mixture Distributed Latent Space [19.312306559210125]
Counterfactual Explanations (CEs) are an important tool in Algorithmic Recourse for addressing two questions. guiding the user's interaction with AI systems by proposing easy-to-understand explanations is essential for the trustworthy adoption and long-term acceptance of AI systems. We introduce a new method to generate CEs for a pre-trained binary classifier by first shaping the latent space of an autoencoder to be a mixture of Gaussian distributions.
arXiv Detail & Related papers (2023-07-25T10:21:26Z)
Finding Regions of Counterfactual Explanations via Robust Optimization [0.0]
A counterfactual explanation (CE) is a minimal perturbed data point for which the decision of the model changes. Most of the existing methods can only provide one CE, which may not be achievable for the user. We derive an iterative method to calculate robust CEs that remain valid even after the features are slightly perturbed.
arXiv Detail & Related papers (2023-01-26T14:06:26Z)
Formalising the Robustness of Counterfactual Explanations for Neural Networks [16.39168719476438]
We introduce an abstraction framework based on interval neural networks to verify the robustness of CFXs. We show how embedding Delta-robustness within existing methods can provide CFXs which are provably robust.
arXiv Detail & Related papers (2022-08-31T14:11:23Z)
MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE) In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity. Experiments on public datasets validate the effectiveness with better validity, sparsity and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
Weakly Supervised Semantic Segmentation via Alternative Self-Dual Teaching [82.71578668091914]
This paper establishes a compact learning framework that embeds the classification and mask-refinement components into a unified deep model. We propose a novel alternative self-dual teaching (ASDT) mechanism to encourage high-quality knowledge interaction.
arXiv Detail & Related papers (2021-12-17T11:56:56Z)
A generalized framework for active learning reliability: survey and benchmark [0.0]
We propose a modular framework to build on-the-fly efficient active learning strategies. We devise 39 strategies for the solution of 20 reliability benchmark problems.
arXiv Detail & Related papers (2021-06-03T09:33:59Z)
Automated and Formal Synthesis of Neural Barrier Certificates for Dynamical Models [70.70479436076238]
We introduce an automated, formal, counterexample-based approach to synthesise Barrier Certificates (BC) The approach is underpinned by an inductive framework, which manipulates a candidate BC structured as a neural network, and a sound verifier, which either certifies the candidate's validity or generates counter-examples. The outcomes show that we can synthesise sound BCs up to two orders of magnitude faster, with in particular a stark speedup on the verification engine.
arXiv Detail & Related papers (2020-07-07T07:39:42Z)
An Online Method for A Class of Distributionally Robust Optimization with Non-Convex Objectives [54.29001037565384]
We propose a practical online method for solving a class of online distributionally robust optimization (DRO) problems. Our studies demonstrate important applications in machine learning for improving the robustness of networks.
arXiv Detail & Related papers (2020-06-17T20:19:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.