On Computing Probabilistic Explanations for Decision Trees
- URL: http://arxiv.org/abs/2207.12213v1
- Date: Thu, 30 Jun 2022 21:58:31 GMT
- Title: On Computing Probabilistic Explanations for Decision Trees
- Authors: Marcelo Arenas, Pablo Barceló, Miguel Romero, Bernardo Subercaseaux
- Abstract summary: "sufficient reasons" are a kind of explanation in which given a decision tree $T$ and an instance $x$, one explains the decision $T(x)$.
Our paper settles the computational complexity of $delta$-sufficient-reasons over decision trees.
We identify structural restrictions of decision trees that make the problem tractable, and show how SAT solvers might be able to tackle these problems in practical settings.
- Score: 4.406418914680962
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Formal XAI (explainable AI) is a growing area that focuses on computing
explanations with mathematical guarantees for the decisions made by ML models.
Inside formal XAI, one of the most studied cases is that of explaining the
choices taken by decision trees, as they are traditionally deemed one of the
most interpretable classes of models. Recent work has focused on studying the
computation of "sufficient reasons", a kind of explanation in which given a
decision tree $T$ and an instance $x$, one explains the decision $T(x)$ by
providing a subset $y$ of the features of $x$ such that for any other instance
$z$ compatible with $y$, it holds that $T(z) = T(x)$, intuitively meaning that
the features in $y$ are already enough to fully justify the classification of
$x$ by $T$. It has been argued, however, that sufficient reasons constitute a
restrictive notion of explanation, and thus the community has started to study
their probabilistic counterpart, in which one requires that the probability of
$T(z) = T(x)$ must be at least some value $\delta \in (0, 1]$, where $z$ is a
random instance that is compatible with $y$. Our paper settles the
computational complexity of $\delta$-sufficient-reasons over decision trees,
showing that both (1) finding $\delta$-sufficient-reasons that are minimal in
size, and (2) finding $\delta$-sufficient-reasons that are minimal
inclusion-wise, do not admit polynomial-time algorithms (unless P=NP). This is
in stark contrast with the deterministic case ($\delta = 1$) where
inclusion-wise minimal sufficient-reasons are easy to compute. By doing this,
we answer two open problems originally raised by Izza et al. On the positive
side, we identify structural restrictions of decision trees that make the
problem tractable, and show how SAT solvers might be able to tackle these
problems in practical settings.
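To make the definition concrete, here is a minimal brute-force sketch (illustrative code, not from the paper) that decides whether fixing a subset of the features of $x$ is a $\delta$-sufficient reason for a small decision tree, assuming binary features and the uniform distribution over completions:

```python
import itertools

# A tree is either a bool leaf (the predicted class) or a tuple
# (feature_index, subtree_if_0, subtree_if_1).
def classify(tree, z):
    while not isinstance(tree, bool):
        feature, if_zero, if_one = tree
        tree = if_one if z[feature] == 1 else if_zero
    return tree

def is_delta_sufficient_reason(tree, x, fixed, delta):
    """True iff, among all completions z that agree with x on the features
    in `fixed`, the fraction with T(z) == T(x) is at least delta.
    Brute force: time exponential in the number of free features."""
    free = [i for i in range(len(x)) if i not in fixed]
    target = classify(tree, x)
    agreeing = 0
    for bits in itertools.product([0, 1], repeat=len(free)):
        z = list(x)
        for i, b in zip(free, bits):
            z[i] = b
        agreeing += classify(tree, z) == target
    return agreeing >= delta * 2 ** len(free)

# Example: T computes x0 AND x1 over 3 features.
tree = (0, False, (1, False, True))
x = [1, 1, 0]
print(is_delta_sufficient_reason(tree, x, {0}, 0.5))   # True: 2 of 4 completions keep T(z) = True
print(is_delta_sufficient_reason(tree, x, {0, 1}, 1))  # True: {x0, x1} is a (deterministic) sufficient reason
```

The paper shows that finding such reasons of minimum size, or ones that are inclusion-wise minimal, admits no polynomial-time algorithm unless P=NP, which is what motivates the structural restrictions and SAT-based approaches discussed above.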
Related papers
- Fast Rates for Bandit PAC Multiclass Classification [73.17969992976501]
We study multiclass PAC learning with bandit feedback, where inputs are classified into one of $K$ possible labels and feedback is limited to whether or not the predicted labels are correct.
Our main contribution is a novel learning algorithm for the agnostic $(\varepsilon,\delta)$-PAC version of the problem.
arXiv Detail & Related papers (2024-06-18T08:54:04Z) - A Theory of Interpretable Approximations [61.90216959710842]
We study the idea of approximating a target concept $c$ by a small aggregation of concepts from some base class $mathcalH$.
For any given pair of $\mathcal{H}$ and $c$, exactly one of three cases holds: (i) $c$ cannot be approximated by $\mathcal{H}$ with arbitrary accuracy; (ii) $c$ can be approximated with arbitrary accuracy, but with no universal bound on the complexity of the approximations; or (iii) $c$ can be approximated with arbitrary accuracy at constant complexity.
We show that, in the case of interpretable approximations, even a slightly nontrivial a-priori guarantee on the complexity of approximations implies approximations with constant (distribution-free and accuracy-independent) complexity.
arXiv Detail & Related papers (2024-06-15T06:43:45Z) - On the Trade-off between the Number of Nodes and the Number of Trees in
a Random Forest [8.340919965932443]
We show that the majority function of $n$ variables can be represented by a bag of $T$ ($< n$) decision trees, each of polynomial size, provided $n-T$ is a constant.
A related result on the $k$-out-of-$n$ functions is presented too.
arXiv Detail & Related papers (2023-12-16T02:09:34Z) - Superpolynomial Lower Bounds for Decision Tree Learning and Testing [5.117810469621253]
We prove, under randomized ETH, superpolynomial lower bounds for two basic problems.
Our results imply new lower bounds for distribution-free PAC learning and testing of decision trees.
We show our lower bound for learning decision trees can be improved to $n^{\Omega(\log s)}$.
arXiv Detail & Related papers (2022-10-12T16:25:48Z) - Agnostic learning with unknown utilities [70.14742836006042]
In many real-world problems, the utility of a decision depends on the underlying context $x$ and decision $y$.
We study this as agnostic learning with unknown utilities.
We show that estimating the utilities of only the sampled points $S$ suffices to learn a decision function which generalizes well.
arXiv Detail & Related papers (2021-04-17T08:22:04Z) - Universal guarantees for decision tree induction via a higher-order
splitting criterion [16.832966312395126]
Our algorithm achieves provable guarantees for all target functions $f: -1,1n to -1,1$ with respect to the uniform distribution.
The crux of our extension is a new splitting criterion that takes into account the correlations between $f$ and small subsets of its attributes.
Our algorithm satisfies the following guarantee: for all target functions $f : \{-1,1\}^n \to \{-1,1\}$, sizes $s \in \mathbb{N}$, and error parameters $\epsilon$, it constructs a decision tree whose size and error are bounded in terms of $s$ and $\epsilon$.
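As a rough illustration of the idea (an assumed toy criterion, not the paper's exact one), a splitting score for attribute $i$ can combine the estimated correlation of $f$ with $x_i$ alone and with every pair containing $i$; on a parity-like target such as $f(x) = x_0 x_1$, first-order correlations vanish while the pair correlation does not:

```python
import random

def correlation(f, subset, n, samples=4000):
    """Monte-Carlo estimate of E[f(x) * prod_{i in subset} x_i]
    over uniform x in {-1,1}^n."""
    total = 0.0
    for _ in range(samples):
        x = [random.choice((-1, 1)) for _ in range(n)]
        chi = 1
        for i in subset:
            chi *= x[i]
        total += f(x) * chi
    return total / samples

def higher_order_split_score(f, i, n):
    """Toy higher-order score: the best of |corr(f, x_i)| and
    |corr(f, x_i * x_j)| over all j != i."""
    score = abs(correlation(f, [i], n))
    for j in range(n):
        if j != i:
            score = max(score, abs(correlation(f, [i, j], n)))
    return score

# f = x0 * x1: a first-order (Gini/entropy-style) criterion sees no signal,
# but the pair {0, 1} has correlation 1, so attributes 0 and 1 score highly.
f = lambda x: x[0] * x[1]
print([round(higher_order_split_score(f, i, 3), 2) for i in range(3)])
```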
arXiv Detail & Related papers (2020-10-16T21:20:45Z) - An Algorithm for Learning Smaller Representations of Models With Scarce
Data [0.0]
We present a greedy algorithm for solving binary classification problems in situations where the dataset is too small or not fully representative.
It relies on a trained model with loose accuracy constraints, an iterative hyperparameter pruning procedure, and a function used to generate new data.
arXiv Detail & Related papers (2020-10-15T19:17:51Z) - An Optimal Separation of Randomized and Quantum Query Complexity [67.19751155411075]
We prove that for every decision tree, the absolute values of the Fourier coefficients of a given order $ellsqrtbinomdell (1+log n)ell-1,$ sum to at most $cellsqrtbinomdell (1+log n)ell-1,$ where $n$ is the number of variables, $d$ is the tree depth, and $c>0$ is an absolute constant.
arXiv Detail & Related papers (2020-08-24T06:50:57Z) - Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample
Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, the discount factor $gamma in (0,1)$, and an approximation threshold $epsilon > 0$, we provide a model-free algorithm to learn an $epsilon$-optimal policy.
For small enough $\epsilon$, we show an algorithm with improved sample complexity.
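For context, "model-free" means the algorithm never estimates the transition probabilities; a minimal sketch of standard tabular Q-learning (the textbook model-free baseline, not the paper's improved algorithm) makes this concrete:

```python
import random

def q_learning(step, S, A, gamma, episodes=5000, horizon=100, alpha=0.1, explore=0.1):
    """Tabular Q-learning. `step(s, a)` samples (reward, next_state) from the
    unknown MDP; the Q-table is updated from sampled transitions alone,
    without ever building a transition model."""
    Q = [[0.0] * A for _ in range(S)]
    for _ in range(episodes):
        s = random.randrange(S)
        for _ in range(horizon):
            # epsilon-greedy action selection
            if random.random() < explore:
                a = random.randrange(A)
            else:
                a = max(range(A), key=lambda b: Q[s][b])
            r, s_next = step(s, a)
            # one-sample Bellman backup toward r + gamma * max_b Q(s_next, b)
            Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
            s = s_next
    return [max(range(A), key=lambda b: Q[s][b]) for s in range(S)]

# Tiny 2-state chain: action 1 in state 0 leads to state 1, which pays reward 1.
def step(s, a):
    if s == 0:
        return (0.0, 1) if a == 1 else (0.0, 0)
    return (1.0, 0)

print(q_learning(step, S=2, A=2, gamma=0.9))  # learns to pick action 1 in state 0
```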
arXiv Detail & Related papers (2020-06-06T13:34:41Z) - Agnostic Q-learning with Function Approximation in Deterministic
Systems: Tight Bounds on Approximation Error and Sample Complexity [94.37110094442136]
We study the problem of agnostic $Q$-learning with function approximation in deterministic systems.
We show that if $\delta = O\left(\rho/\sqrt{\dim_E}\right)$, then one can find the optimal policy using $O\left(\dim_E\right)$ trajectories.
arXiv Detail & Related papers (2020-02-17T18:41:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.