On Explaining Decision Trees
- URL: http://arxiv.org/abs/2010.11034v1
- Date: Wed, 21 Oct 2020 14:33:53 GMT
- Title: On Explaining Decision Trees
- Authors: Yacine Izza, Alexey Ignatiev, and Joao Marques-Silva
- Abstract summary: Decision trees (DTs) epitomize what has come to be known as interpretable machine learning (ML) models.
This paper shows that in some settings DTs can hardly be deemed interpretable, with paths in a DT being arbitrarily larger than a PI-explanation.
- Score: 17.646704122091087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decision trees (DTs) epitomize what has come to be known as interpretable
machine learning (ML) models. This is informally motivated by paths in DTs
often being much shorter than the total number of features. This paper shows
that in some settings DTs can hardly be deemed interpretable, with paths in a
DT being arbitrarily larger than a PI-explanation, i.e. a subset-minimal set of
feature values that entails the prediction. As a result, the paper proposes a
novel model for computing PI-explanations of DTs, which enables computing one
PI-explanation in polynomial time. Moreover, it is shown that enumeration of
PI-explanations can be reduced to the enumeration of minimal hitting sets.
Experimental results were obtained on a wide range of publicly available
datasets with well-known DT-learning tools, and confirm that in most cases DTs
have paths that are proper supersets of PI-explanations.
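The deletion-based idea behind computing a single PI-explanation in polynomial time can be sketched directly on a trained tree. The following is a minimal sketch, not the paper's implementation: it assumes a fitted scikit-learn DecisionTreeClassifier, and the helpers path_literals, entails, and pi_explanation are hypothetical names. It collects the literals on an instance's root-to-leaf path and drops every literal whose removal still entails the same prediction, i.e. every leaf consistent with the remaining literals predicts the same class.

```python
# Minimal sketch (not the paper's implementation) of deletion-based
# PI-explanation extraction for a scikit-learn decision tree.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier


def path_literals(clf, x):
    """Literals (feature, threshold, is_leq) on x's root-to-leaf path."""
    t = clf.tree_
    node, lits = 0, []
    while t.children_left[node] != -1:  # -1 marks a leaf in sklearn trees
        f, thr = t.feature[node], t.threshold[node]
        is_leq = x[f] <= thr
        lits.append((f, thr, is_leq))
        node = t.children_left[node] if is_leq else t.children_right[node]
    return lits, int(np.argmax(t.value[node]))


def entails(clf, lits, target, n_features):
    """True iff every leaf consistent with the literals predicts target."""
    lo = np.full(n_features, -np.inf)  # literals define a per-feature box
    hi = np.full(n_features, np.inf)
    for f, thr, is_leq in lits:
        if is_leq:
            hi[f] = min(hi[f], thr)
        else:
            lo[f] = max(lo[f], thr)
    t = clf.tree_

    def all_leaves_ok(node, lo, hi):
        if t.children_left[node] == -1:
            return int(np.argmax(t.value[node])) == target
        f, thr = t.feature[node], t.threshold[node]
        if lo[f] < thr:  # left subtree (x[f] <= thr) still reachable
            h = hi.copy()
            h[f] = min(h[f], thr)
            if not all_leaves_ok(t.children_left[node], lo, h):
                return False
        if hi[f] > thr:  # right subtree (x[f] > thr) still reachable
            l = lo.copy()
            l[f] = max(l[f], thr)
            if not all_leaves_ok(t.children_right[node], l, hi):
                return False
        return True

    return all_leaves_ok(0, lo, hi)


def pi_explanation(clf, x):
    """Shrink x's path literals to a subset-minimal explanation."""
    lits, target = path_literals(clf, x)
    kept = list(lits)
    for lit in lits:  # one deletion pass suffices: entailment is monotone
        trial = [l for l in kept if l != lit]
        if entails(clf, trial, target, len(x)):
            kept = trial
    return kept, target


X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(random_state=0).fit(X, y)
expl, cls = pi_explanation(clf, X[0])
print(f"path literals: {len(path_literals(clf, X[0])[0])}, "
      f"PI-explanation literals: {len(expl)}, class: {cls}")
```

Each entailment check visits each tree node at most once, so shrinking a path of length k costs at most k checks of cost linear in the tree size, consistent with the polynomial-time claim; the result is subset-minimal but not necessarily a shortest explanation. Enumerating all PI-explanations requires the further reduction to minimal hitting set enumeration described in the abstract.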
Related papers
- Vanilla Gradient Descent for Oblique Decision Trees [7.236325471627686]
We propose DTSemNet, a novel encoding of (hard, oblique) DTs as Neural Networks (NNs).
Experiments show oblique DTs learned using DTSemNet are more accurate than oblique DTs of similar size learned using state-of-the-art techniques.
We also experimentally demonstrate that DTSemNet can learn DT policies as efficiently as NN policies in the Reinforcement Learning (RL) setup with physical inputs.
arXiv Detail & Related papers (2024-08-17T08:18:40Z)
- Bayesian Prediction-Powered Inference [62.2436697657307]
Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data.
We propose a framework for PPI based on Bayesian inference that allows researchers to develop new task-appropriate PPI methods easily.
arXiv Detail & Related papers (2024-05-09T18:08:58Z)
- iPINNs: Incremental learning for Physics-informed neural networks [66.4795381419701]
Physics-informed neural networks (PINNs) have recently become a powerful tool for solving partial differential equations (PDEs).
We propose incremental PINNs that can learn multiple tasks sequentially without additional parameters for new tasks and improve performance for every equation in the sequence.
Our approach learns multiple PDEs starting from the simplest one by creating its own subnetwork for each PDE and allowing each subnetwork to overlap with previously learned subnetworks.
arXiv Detail & Related papers (2023-04-10T20:19:20Z)
- When does Privileged Information Explain Away Label Noise? [66.9725683097357]
We investigate the role played by different properties of the PI in explaining away label noise.
We find that PI is most helpful when it allows networks to easily distinguish clean from noisy data.
We propose several enhancements to the state-of-the-art PI methods and demonstrate the potential of PI as a means of tackling label noise.
arXiv Detail & Related papers (2023-03-03T09:25:39Z)
- On Tackling Explanation Redundancy in Decision Trees [19.833126971063724]
Decision trees (DTs) epitomize the ideal of interpretability of machine learning (ML) models.
This paper offers both theoretical and experimental arguments demonstrating that, if interpretability of decision trees equates with succinctness of explanations, then decision trees ought not to be deemed interpretable.
arXiv Detail & Related papers (2022-05-20T05:33:38Z)
- Provably Precise, Succinct and Efficient Explanations for Decision Trees [32.062312674333775]
Decision trees (DTs) embody interpretable classifiers.
Recent work has demonstrated that predictions of DTs ought to be explained with rigorous explanations.
Delta-relevant sets denote explanations that are succinct and provably precise.
arXiv Detail & Related papers (2022-05-19T13:54:52Z)
- Scalable Sampling for Nonsymmetric Determinantal Point Processes [47.18450267217498]
A determinantal point process (DPP) on a collection of $M$ items is a model parameterized by a symmetric kernel matrix.
Recent work shows that removing the kernel symmetry constraint, yielding nonsymmetric DPPs (NDPPs), can lead to significant predictive performance gains for machine learning applications.
There is only one known DPP sampling algorithm, based on Cholesky decomposition, that can directly apply to NDPPs as well.
We show that imposing certain structural constraints on the NDPP kernel enables us to bound the rejection rate in a way that depends only on the kernel rank.
arXiv Detail & Related papers (2022-01-20T19:21:59Z)
- On Efficiently Explaining Graph-Based Classifiers [16.199563506727316]
This paper shows that not only may decision trees (DTs) not be interpretable; it also proposes a polynomial-time algorithm for computing one PI-explanation of a DT.
In addition, the paper proposes a polynomial-time algorithm for computing one contrastive explanation.
arXiv Detail & Related papers (2021-06-02T17:55:41Z)
- On Explaining Random Forests with SAT [3.5408022972081685]
Random Forests (RFs) are among the most widely used Machine Learning (ML) classifiers.
RFs are not interpretable, and there are no dedicated non-heuristic approaches for computing explanations of RFs.
This paper proposes a propositional encoding for computing explanations of RFs, thus enabling finding PI-explanations with a SAT solver.
arXiv Detail & Related papers (2021-05-21T11:05:14Z)
- Learning Reasoning Strategies in End-to-End Differentiable Proving [50.9791149533921]
Conditional Theorem Provers learn an optimal rule selection strategy via gradient-based optimisation.
We show that Conditional Theorem Provers are scalable and yield state-of-the-art results on the CLUTRR dataset.
arXiv Detail & Related papers (2020-07-13T16:22:14Z)
- Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
arXiv Detail & Related papers (2020-05-11T09:53:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.