On Explaining Random Forests with SAT
- URL: http://arxiv.org/abs/2105.10278v1
- Date: Fri, 21 May 2021 11:05:14 GMT
- Title: On Explaining Random Forests with SAT
- Authors: Yacine Izza and Joao Marques-Silva
- Abstract summary: Random Forests (RFs) are among the most widely used Machine Learning (ML) classifiers.
Even though RFs are not interpretable, there are no dedicated non-heuristic approaches for computing explanations of RFs.
This paper proposes a propositional encoding for computing explanations of RFs, thus enabling finding PI-explanations with a SAT solver.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Random Forests (RFs) are among the most widely used Machine Learning (ML)
classifiers. Even though RFs are not interpretable, there are no dedicated
non-heuristic approaches for computing explanations of RFs. Moreover, there is
recent work on polynomial algorithms for explaining ML models, including naive
Bayes classifiers. Hence, one question is whether finding explanations of RFs
can be solved in polynomial time. This paper answers this question negatively,
by proving that computing one PI-explanation of an RF is D^P-complete.
Furthermore, the paper proposes a propositional encoding for computing
explanations of RFs, thus enabling finding PI-explanations with a SAT solver.
This contrasts with earlier work on explaining boosted trees (BTs) and neural
networks (NNs), which requires encodings based on SMT/MILP. Experimental
results, obtained on a wide range of publicly available datasets, demonstrate
that the proposed SAT-based approach scales to RFs of sizes common in practical
applications. Perhaps more importantly, the experimental results demonstrate
that, for the vast majority of examples considered, the SAT-based approach
proposed in this paper significantly outperforms existing heuristic approaches.
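A PI-explanation is a subset-minimal set of feature values sufficient to force the classifier's prediction, and the paper extracts one by repeatedly querying a SAT solver over the propositional encoding of the RF. The sketch below illustrates only the deletion-based minimization loop, replacing the SAT oracle with a brute-force entailment check over a hypothetical three-feature boolean classifier; the `predict` function and the instance are invented for illustration and are not the paper's RF encoding.

```python
from itertools import product

# Toy stand-in classifier over 3 boolean features: predicts True iff (x0 AND x1) OR x2.
# (Invented for illustration; the paper encodes a full RF into propositional logic instead.)
def predict(x):
    return (x[0] and x[1]) or x[2]

def entails(fixed, target, n=3):
    """True iff every completion of the fixed feature values yields `target`.
    This brute-force check plays the role of the SAT oracle."""
    free = [i for i in range(n) if i not in fixed]
    for bits in product([False, True], repeat=len(free)):
        point = dict(fixed)
        point.update(zip(free, bits))
        if predict([point[i] for i in range(n)]) != target:
            return False
    return True

def pi_explanation(instance):
    """Deletion-based search for a subset-minimal sufficient reason (PI-explanation)."""
    target = predict(instance)
    fixed = dict(enumerate(instance))
    for i in list(fixed):
        trial = {j: v for j, v in fixed.items() if j != i}
        if entails(trial, target):
            fixed = trial  # feature i was not needed for the prediction
    return fixed

print(pi_explanation([True, True, False]))  # {0: True, 1: True}
```

For the instance (True, True, False), fixing x0 and x1 alone already forces the prediction, so x2 is dropped; the remaining two feature values are the sufficient reason.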
Related papers
- A survey and taxonomy of methods interpreting random forest models [0.0]
The interpretability of random forest (RF) models is a research topic of growing interest in the machine learning (ML) community.
The resulting RF model is regarded as a "black box" because of its numerous deep decision trees.
This paper aims to provide an extensive review of methods used in the literature to interpret RF models.
arXiv Detail & Related papers (2024-07-17T17:33:32Z)
- VQ-T: RNN Transducers using Vector-Quantized Prediction Network States [52.48566999668521]
We propose to use vector-quantized long short-term memory units in the prediction network of RNN transducers.
By training the discrete representation jointly with the ASR network, hypotheses can be actively merged for lattice generation.
Our experiments on the Switchboard corpus show that the proposed VQ RNN transducers improve ASR performance over transducers with regular prediction networks.
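Stripped of the ASR context, the vector-quantization step at the heart of this idea maps a continuous prediction-network state to the index of its nearest entry in a learned codebook, so distinct histories that quantize to the same index can be merged. A minimal sketch with an invented 2-D codebook (real systems learn the codebook jointly with the network):

```python
# Hypothetical learned codebook of 2-D prediction-network states (invented values).
CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

def quantize(state):
    """Index of the nearest codebook entry (squared Euclidean distance)."""
    def dist2(entry):
        return sum((s - e) ** 2 for s, e in zip(state, entry))
    return min(range(len(CODEBOOK)), key=lambda i: dist2(CODEBOOK[i]))

# Two slightly different continuous states collapse to the same discrete index,
# which is what lets distinct hypotheses be merged during lattice generation.
print(quantize((0.9, 0.1)), quantize((0.95, 0.05)))  # 1 1
```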
arXiv Detail & Related papers (2022-08-03T02:45:52Z)
- Linear Temporal Logic Modulo Theories over Finite Traces (Extended Version) [72.38188258853155]
This paper studies Linear Temporal Logic over Finite Traces (LTLf) in which
proposition letters are replaced with first-order formulas interpreted over arbitrary theories.
The resulting logic, called LTLf Modulo Theories (LTLfMT), is semi-decidable.
arXiv Detail & Related papers (2022-04-28T17:57:33Z)
- A Generalizable Model-and-Data Driven Approach for Open-Set RFF Authentication [74.63333951647581]
Radio-frequency fingerprints (RFFs) are promising solutions for realizing low-cost physical layer authentication.
Machine learning-based methods have been proposed for RFF extraction and discrimination.
We propose a new end-to-end deep learning framework for extracting RFFs from raw received signals.
arXiv Detail & Related papers (2021-08-10T03:59:37Z)
- On Efficiently Explaining Graph-Based Classifiers [16.199563506727316]
This paper shows that decision trees (DTs) may not be interpretable, and proposes a polynomial-time algorithm for computing one PI-explanation of a DT.
In addition, the paper also proposes a polynomial-time algorithm for computing one contrastive explanation.
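For a single decision tree the sufficient-reason computation really is tractable: checking whether a partial instance still forces the prediction is one traversal of the tree, and the deletion loop runs once per feature, so the whole procedure is polynomial in the tree size. A minimal sketch over a hypothetical two-feature tree (the tree and its encoding are invented for illustration, not taken from the paper):

```python
# Tiny hypothetical decision tree over boolean features (invented for illustration):
# internal nodes are (feature, subtree_if_0, subtree_if_1); leaves are class labels.
TREE = ("x0",
        ("x1", 0, 1),  # taken when x0 == 0
        1)             # taken when x0 == 1

def reachable_classes(tree, fixed):
    """Classes of all leaves reachable once the features in `fixed` are pinned.
    A single traversal of the tree, so each check is linear in tree size."""
    if not isinstance(tree, tuple):
        return {tree}
    feat, low, high = tree
    if feat in fixed:
        return reachable_classes(high if fixed[feat] else low, fixed)
    return reachable_classes(low, fixed) | reachable_classes(high, fixed)

def dt_explanation(instance):
    """Deletion-based sufficient reason: one linear-time check per feature,
    so polynomial overall."""
    fixed = dict(instance)
    target = reachable_classes(TREE, fixed).pop()
    for feat in list(fixed):
        trial = {k: v for k, v in fixed.items() if k != feat}
        if reachable_classes(TREE, trial) == {target}:
            fixed = trial  # this feature value is not needed
    return fixed

print(dt_explanation({"x0": 1, "x1": 0}))  # {'x0': 1}
```

Here x0 = 1 alone suffices: the right branch is class 1 regardless of x1.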
arXiv Detail & Related papers (2021-06-02T17:55:41Z)
- Efficient semidefinite-programming-based inference for binary and multi-class MRFs [83.09715052229782]
We propose an efficient method for computing the partition function or MAP estimate in a pairwise MRF.
We extend semidefinite relaxations from the typical binary MRF to the full multi-class setting, and develop a compact semidefinite relaxation that can again be solved efficiently with the same solver.
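The quantities being approximated can be stated exactly on a toy instance: the partition function sums exp(score) over all 2^n assignments and the MAP estimate maximizes the score, both exponential if computed by brute force, which is the cost the semidefinite relaxation avoids. A hypothetical 3-node binary MRF with invented parameters:

```python
from itertools import product
import math

# Hypothetical 3-node binary pairwise MRF (parameters invented for illustration).
theta = [0.5, -0.2, 0.3]             # unary potentials
edges = {(0, 1): 0.3, (1, 2): -0.4}  # pairwise potentials

def score(x):
    """Unnormalized log-probability of assignment x."""
    s = sum(theta[i] * x[i] for i in range(len(theta)))
    s += sum(w * x[i] * x[j] for (i, j), w in edges.items())
    return s

def partition_function():
    """Z: sum of exp(score) over all 2^n assignments -- exponential cost,
    which is what the semidefinite relaxation is designed to avoid."""
    return sum(math.exp(score(x)) for x in product([0, 1], repeat=len(theta)))

def map_estimate():
    """Highest-scoring assignment, again by exhaustive enumeration."""
    return max(product([0, 1], repeat=len(theta)), key=score)

print(map_estimate())  # (1, 0, 1)
```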
arXiv Detail & Related papers (2020-12-04T15:36:29Z)
- AIN: Fast and Accurate Sequence Labeling with Approximate Inference Network [75.44925576268052]
The linear-chain Conditional Random Field (CRF) model is one of the most widely-used neural sequence labeling approaches.
Exact probabilistic inference algorithms are typically applied in training and prediction stages of the CRF model.
We propose to employ a parallelizable approximate variational inference algorithm for the CRF model.
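The exact prediction-stage inference being replaced is, for a linear-chain CRF, the Viterbi algorithm: dynamic programming over emission scores plus tag-transition scores. A self-contained sketch (the score matrices below are invented; a real CRF would learn them):

```python
def viterbi(emissions, transitions):
    """Exact MAP decoding for a linear-chain CRF via dynamic programming.
    emissions[t][j]: score of tag j at position t; transitions[i][j]: score of i -> j."""
    n_tags = len(transitions)
    score = list(emissions[0])
    backpointers = []
    for em in emissions[1:]:
        new_score, ptrs = [], []
        for j in range(n_tags):
            best = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best] + transitions[best][j] + em[j])
            ptrs.append(best)
        score = new_score
        backpointers.append(ptrs)
    # Backtrack from the best final tag.
    tag = max(range(n_tags), key=lambda j: score[j])
    path = [tag]
    for ptrs in reversed(backpointers):
        tag = ptrs[tag]
        path.append(tag)
    return path[::-1]

print(viterbi([[1, 0], [0, 1]], [[0, 0], [0, 0]]))  # [0, 1]
```

With zero transition scores, decoding just follows the best emission at each position; strong transition scores can override the emissions, which is the structured-prediction effect the approximate variational inference has to preserve.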
arXiv Detail & Related papers (2020-09-17T12:18:43Z)
- Learning Reasoning Strategies in End-to-End Differentiable Proving [50.9791149533921]
Conditional Theorem Provers learn optimal rule selection strategy via gradient-based optimisation.
We show that Conditional Theorem Provers are scalable and yield state-of-the-art results on the CLUTRR dataset.
arXiv Detail & Related papers (2020-07-13T16:22:14Z)
- Random Partitioning Forest for Point-Wise and Collective Anomaly Detection -- Application to Intrusion Detection [9.74672460306765]
DiFF-RF is an ensemble approach composed of random partitioning binary trees to detect anomalies.
Our experiments show that DiFF-RF almost systematically outperforms the isolation forest (IF) algorithm.
Our experiments also show that DiFF-RF can work well in the presence of small-scale learning data.
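The building block here is a tree that recursively splits the data at random thresholds, so that outliers end up isolated near the root. A minimal 1-D sketch of that partitioning in the isolation-forest style (this is not the actual DiFF-RF score, which additionally uses distances and visiting frequencies):

```python
import random

def build_tree(points, depth=0, max_depth=8):
    """Recursively split 1-D points at a uniform random threshold.
    A leaf stores the depth at which its points were isolated."""
    if len(points) <= 1 or depth >= max_depth:
        return depth
    lo, hi = min(points), max(points)
    if lo == hi:
        return depth
    split = random.uniform(lo, hi)
    left = [p for p in points if p < split]
    right = [p for p in points if p >= split]
    return (split, build_tree(left, depth + 1, max_depth),
            build_tree(right, depth + 1, max_depth))

def path_length(tree, x):
    """Depth at which x lands; anomalies tend to be isolated near the root."""
    if not isinstance(tree, tuple):
        return tree
    split, left, right = tree
    return path_length(left if x < split else right, x)

# Ensemble usage: average path length over many random trees.
random.seed(0)
data = [i / 10 for i in range(20)]  # invented inliers in [0, 1.9]
forest = [build_tree(data + [100.0]) for _ in range(25)]
avg = lambda x: sum(path_length(t, x) for t in forest) / len(forest)
# The far-away point 100.0 is typically isolated at a much shallower depth than an inlier.
```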
arXiv Detail & Related papers (2020-06-29T10:44:08Z)
- Interpretation and Simplification of Deep Forest [4.576379639081977]
We consider quantifying the feature contributions and frequencies of a fully trained deep RF in the form of a decision rule set.
Model simplification is achieved by eliminating unnecessary rules by measuring the feature contributions.
Experimental results show that feature contribution analysis allows a black-box model to be decomposed into a quantitatively interpretable rule set.
arXiv Detail & Related papers (2020-01-14T11:30:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.