Rule Covering for Interpretation and Boosting
- URL: http://arxiv.org/abs/2007.06379v2
- Date: Sat, 19 Sep 2020 13:56:58 GMT
- Title: Rule Covering for Interpretation and Boosting
- Authors: S. Ilker Birbil, Mert Edali, Birol Yuceoglu
- Abstract summary: We propose two algorithms for interpretation and boosting of tree-based ensemble methods.
The first algorithm uses the collection of decision trees obtained from a trained random forest model.
The second algorithm uses a rule generation scheme for boosting decision trees.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose two algorithms for interpretation and boosting of tree-based
ensemble methods. Both algorithms make use of mathematical programming models
that are constructed with a set of rules extracted from an ensemble of decision
trees. The objective is to obtain the minimum total impurity with the least
number of rules that cover all the samples. The first algorithm uses the
collection of decision trees obtained from a trained random forest model. Our
numerical results show that the proposed rule covering approach selects only a
few rules that could be used for interpreting the random forest model.
Moreover, the resulting set of rules closely matches the accuracy level of the
random forest model. Inspired by the column generation algorithm in linear
programming, our second algorithm uses a rule generation scheme for boosting
decision trees. We use the dual optimal solutions of the linear programming
models as sample weights to obtain only those rules that would improve the
accuracy. With a computational study, we observe that our second algorithm
performs competitively with other well-known boosting methods. Our
implementations also demonstrate that both algorithms can be trivially coupled
with existing random forest and decision tree packages.
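The covering objective described in the abstract admits a compact set-covering formulation. The sketch below is a minimal reconstruction under assumed notation (rule index set R, sample index set I, rule impurities c_r, coverage indicators a_{ir}, and selection variables z_r); it is not necessarily the paper's exact model, which may balance impurity and rule count differently.

```latex
% Minimal sketch of a rule-covering model (illustrative notation, not the paper's exact formulation).
% R   : rules extracted from the tree ensemble
% I   : training samples
% c_r : impurity of rule r (possibly plus a small penalty to keep the rule count low)
% a_{ir} = 1 if rule r covers sample i, 0 otherwise
% z_r = 1 if rule r is selected
\begin{align}
\min_{z} \quad & \sum_{r \in R} c_r\, z_r \\
\text{s.t.} \quad & \sum_{r \in R} a_{ir}\, z_r \ge 1, \qquad i \in I, \\
& z_r \in \{0, 1\}, \qquad r \in R.
\end{align}
```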
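As a concrete illustration of the first algorithm, the following Python sketch extracts one rule per leaf of a trained scikit-learn random forest (a rule being the root-to-leaf path, with the leaf impurity as its cost) and solves the LP relaxation of the covering model with SciPy. The dataset, the small 1e-3 rule penalty, and the use of the LP relaxation instead of the integer program are illustrative assumptions, not choices taken from the paper.

```python
# Minimal sketch: extract leaf rules from a trained random forest and select a
# small covering subset via the LP relaxation of a set-covering model.
# Illustrative only; not the authors' implementation.

import numpy as np
from scipy.optimize import linprog
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
forest = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)

costs, coverage = [], []
for est in forest.estimators_:
    tree = est.tree_
    leaf_of_sample = est.apply(X)                     # leaf id for every sample
    for leaf in np.where(tree.children_left == -1)[0]:
        costs.append(tree.impurity[leaf] + 1e-3)      # impurity + small rule penalty
        coverage.append(leaf_of_sample == leaf)       # which samples the rule covers

A = np.array(coverage, dtype=float).T                 # samples x rules, a_{ir}
res = linprog(np.array(costs), A_ub=-A, b_ub=-np.ones(A.shape[0]),
              bounds=(0, 1), method="highs")          # LP relaxation of: min c^T z, A z >= 1
selected = np.flatnonzero(res.x > 1e-6)
print(f"{len(selected)} of {len(costs)} rules carry weight in the LP relaxation")
```

Rounding the LP solution, or solving the integer model with a MIP solver, would give the small interpretable rule set the abstract refers to.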
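The second algorithm can be mimicked with a simple restricted-master loop: solve the covering LP over the rules generated so far, read off the duals of the covering constraints, and use them as sample weights when growing the next tree. The helper structure and parameter choices below are assumptions for illustration; only the overall idea of using LP duals as sample weights comes from the abstract.

```python
# Rough sketch of the column-generation-style boosting loop: LP duals of the
# covering constraints become sample weights for the next tree.
# Illustrative only; not the authors' implementation.

import numpy as np
from scipy.optimize import linprog
from sklearn.tree import DecisionTreeClassifier

def leaf_rules(tree_clf, X):
    """One rule per leaf: (impurity-based cost, boolean coverage row)."""
    t = tree_clf.tree_
    leaf_of_sample = tree_clf.apply(X)
    leaves = np.where(t.children_left == -1)[0]
    costs = [t.impurity[leaf] + 1e-3 for leaf in leaves]   # impurity + small penalty
    cover = [leaf_of_sample == leaf for leaf in leaves]
    return costs, cover

def covering_duals(costs, cover, n_samples):
    """Solve the restricted covering LP; return duals of the 'cover every sample' rows."""
    A = np.array(cover, dtype=float).T                      # samples x rules
    res = linprog(np.array(costs), A_ub=-A, b_ub=-np.ones(n_samples),
                  bounds=(0, 1), method="highs")
    # With the HiGHS backend, res.ineqlin.marginals holds the constraint duals
    # (assumption: SciPy >= 1.7); they are nonpositive here, so take the magnitude.
    return np.abs(res.ineqlin.marginals)

def rule_boosting(X, y, n_rounds=10, max_depth=3):
    costs, cover = [], []
    weights = np.ones(len(y))                               # start from uniform weights
    for _ in range(n_rounds):
        # Grow a shallow tree under the current weights; its leaves are the new
        # candidate rules ("columns") added to the restricted master problem.
        tree = DecisionTreeClassifier(max_depth=max_depth, random_state=0)
        tree.fit(X, y, sample_weight=weights)
        new_costs, new_cover = leaf_rules(tree, X)
        costs.extend(new_costs)
        cover.extend(new_cover)
        # Samples whose covering constraints are expensive get larger weights,
        # steering the next tree toward rules that improve the covering.
        weights = np.maximum(covering_duals(costs, cover, len(y)), 1e-6)
    return costs, cover
```

How the selected rules are finally combined into a classifier is not spelled out in the abstract, so that step is deliberately left out of this sketch.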
Related papers
- Simple and Provable Scaling Laws for the Test-Time Compute of Large Language Models [70.07661254213181]
We propose two principled algorithms for the test-time compute of large language models.
We prove theoretically that the failure probability of one algorithm decays to zero exponentially as its test-time compute grows.
arXiv Detail & Related papers (2024-11-29T05:29:47Z)
- Bregman-divergence-based Arimoto-Blahut algorithm [53.64687146666141]
We generalize the Arimoto-Blahut algorithm to a general function defined over a Bregman-divergence system.
We propose a convex-optimization-free algorithm that can be applied to classical and quantum rate-distortion theory.
arXiv Detail & Related papers (2024-08-10T06:16:24Z)
- Learning Decision Trees and Forests with Algorithmic Recourse [11.401006371457436]
Algorithmic Recourse (AR) aims to provide a recourse action for altering the undesired prediction result given by a model.
We formulate the task of learning an accurate classification tree under the constraint of ensuring the existence of reasonable actions for as many instances as possible.
arXiv Detail & Related papers (2024-06-03T08:33:42Z)
- Optimal estimation of Gaussian (poly)trees [25.02920605955238]
We consider both problems of distribution learning (i.e., in KL distance) and structure learning (i.e., exact recovery).
The first approach is based on the Chow-Liu algorithm, and learns an optimal tree-structured distribution efficiently.
The second approach is a modification of the PC algorithm for polytrees that uses partial correlation as a conditional independence tester for constraint-based structure learning.
arXiv Detail & Related papers (2024-02-09T12:58:36Z)
- Learning the Positions in CountSketch [49.57951567374372]
We consider sketching algorithms which first compress data by multiplication with a random sketch matrix, and then apply the sketch to quickly solve an optimization problem.
In this work, we propose the first learning-based algorithms that also optimize the locations of the non-zero entries.
arXiv Detail & Related papers (2023-06-11T07:28:35Z)
- Explaining random forest prediction through diverse rulesets [0.0]
Local Tree eXtractor (LTreeX) is able to explain the forest prediction for a given test instance with a few diverse rules.
We show that our proposed approach substantially outperforms other explainable methods in terms of predictive performance.
arXiv Detail & Related papers (2022-03-29T12:54:57Z)
- E2E-FS: An End-to-End Feature Selection Method for Neural Networks [0.3222802562733786]
We present a novel feature selection algorithm, called End-to-End Feature Selection (E2E-FS).
Our algorithm, similar to the lasso approach, is solved with gradient descent techniques.
Despite its hard restrictions, experimental results show that this algorithm can be used with any learning model.
arXiv Detail & Related papers (2020-12-14T16:19:25Z)
- Efficient Computation of Expectations under Spanning Tree Distributions [67.71280539312536]
We propose unified algorithms for the important cases of first-order expectations and second-order expectations in edge-factored, non-projective spanning-tree models.
Our algorithms exploit a fundamental connection between gradients and expectations, which allows us to derive efficient algorithms.
arXiv Detail & Related papers (2020-08-29T14:58:26Z)
- MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
- Model Selection in Contextual Stochastic Bandit Problems [51.94632035240787]
We develop a meta-algorithm that selects between base algorithms.
We show through a lower bound that even when one of the base algorithms has $O(\sqrt{T})$ regret, in general it is impossible to get better than $\Omega(\sqrt{T})$ regret.
arXiv Detail & Related papers (2020-03-03T18:46:34Z)
- Boosting Algorithms for Estimating Optimal Individualized Treatment Rules [4.898659895355356]
We present nonparametric algorithms for estimating optimal individualized treatment rules.
The proposed algorithms are based on the XGBoost algorithm, which is known as one of the most powerful algorithms in the machine learning literature.
arXiv Detail & Related papers (2020-01-31T22:26:38Z)