Interpretable Random Forests via Rule Extraction
- URL: http://arxiv.org/abs/2004.14841v4
- Date: Wed, 10 Feb 2021 09:09:31 GMT
- Title: Interpretable Random Forests via Rule Extraction
- Authors: Clément Bénard (LPSM (UMR_8001)), Gérard Biau (LSTA),
Sébastien da Veiga, Erwan Scornet (CMAP)
- Abstract summary: We introduce SIRUS, a stable rule learning algorithm which takes the form of a short and simple list of rules.
Our R/C++ software implementation sirus is available from CRAN.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce SIRUS (Stable and Interpretable RUle Set) for regression, a
stable rule learning algorithm which takes the form of a short and simple list
of rules. State-of-the-art learning algorithms are often referred to as "black
boxes" because of the high number of operations involved in their prediction
process. Despite their strong predictive power, this lack of interpretability may
be highly restrictive for applications with critical decisions at stake. On the
other hand, algorithms with a simple structure (typically decision trees, rule
algorithms, or sparse linear models) are well known for their instability. This
undesirable feature makes the conclusions of the data analysis unreliable and
turns out to be a strong operational limitation. This motivates the design of
SIRUS, which combines a simple structure with remarkably stable behavior when
data is perturbed. The algorithm is based on random forests, the predictive
accuracy of which is preserved. We demonstrate the efficiency of the method
both empirically (through experiments) and theoretically (with the proof of its
asymptotic stability). Our R/C++ software implementation sirus is available
from CRAN.
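The paper's own implementation is the sirus R/C++ package on CRAN; as a rough illustration of the rule-extraction idea it describes (quantile-discretized features, frequent split paths across a shallow forest, a frequency threshold p0), here is a minimal Python sketch built on scikit-learn. The helper name `extract_rules`, the depth limit, and the toy data are assumptions made for this sketch, not the authors' code.

```python
from collections import Counter

import numpy as np
from sklearn.ensemble import RandomForestRegressor

def extract_rules(forest, X, y, p0=0.04, depth=2):
    """Count depth-limited split paths across trees; keep the frequent ones."""
    paths = Counter()
    for tree in forest.estimators_:
        t = tree.tree_
        stack = [(0, ())]
        while stack:
            node, path = stack.pop()
            if t.children_left[node] == -1 or len(path) == depth:
                if path:
                    paths[path] += 1  # a completed root-to-leaf (or depth-2) path
                continue
            f, s = t.feature[node], t.threshold[node]
            stack.append((t.children_left[node], path + ((f, s, "<="),)))
            stack.append((t.children_right[node], path + ((f, s, ">"),)))
    # Keep a path as a rule when its frequency across trees exceeds p0.
    selected = [p for p, c in paths.items() if c / len(forest.estimators_) > p0]
    rules = []
    for path in selected:
        mask = np.ones(len(X), dtype=bool)
        for f, s, side in path:
            mask &= (X[:, f] <= s) if side == "<=" else (X[:, f] > s)
        if mask.any() and (~mask).any():
            # Each rule predicts the mean response inside/outside its region.
            rules.append((path, y[mask].mean(), y[~mask].mean()))
    return rules

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = 2.0 * (X[:, 0] > 0) + 0.1 * rng.normal(size=500)
# Quantile discretization stabilizes the candidate splits, echoing the paper.
Xq = np.stack([np.digitize(X[:, j], np.quantile(X[:, j], np.linspace(0.1, 0.9, 9)))
               for j in range(X.shape[1])], axis=1).astype(float)
forest = RandomForestRegressor(n_estimators=100, max_depth=2, random_state=0).fit(Xq, y)
for path, inside, outside in extract_rules(forest, Xq, y)[:5]:
    print(path, round(inside, 2), round(outside, 2))
```

For prediction, SIRUS averages the outputs of the selected rules; the sketch stops at rule selection to stay short.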
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z)
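As a hedged sketch of the gradient-free idea behind the entry above: a classical two-point zeroth-order estimator plugged into SGD. The paper's accelerated, higher-order-smoothness variant is more sophisticated; the step sizes and toy objective below are invented for illustration.

```python
# Two function evaluations per step replace a gradient oracle.
import numpy as np

def zo_sgd(f, x0, steps=2000, lr=0.05, delta=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d = x.size
    for _ in range(steps):
        u = rng.normal(size=d)
        u /= np.linalg.norm(u)  # random direction on the unit sphere
        # Two-point symmetric finite-difference estimate of the gradient.
        g = d * (f(x + delta * u) - f(x - delta * u)) / (2 * delta) * u
        x -= lr * g
    return x

f = lambda x: np.sum((x - 1.0) ** 2)  # toy smooth convex objective
print(zo_sgd(f, np.zeros(5)))         # approaches the minimizer (1, ..., 1)
```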
- Efficiently Learning Probabilistic Logical Models by Cheaply Ranking Mined Rules [9.303501974597548]
We introduce precision and recall for logical rules and define their composition as rule utility.
We introduce SPECTRUM, a scalable framework for learning logical models from relational data.
arXiv Detail & Related papers (2024-09-24T16:54:12Z)
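A simplified illustration of precision and recall for a logical rule, evaluated against observed relational facts; SPECTRUM's exact utility composition and its scalable mining pipeline are in the paper, and the toy knowledge base below is made up.

```python
# Rule: parent(x, y) AND parent(y, z) -> grandparent(x, z)
parent = {("ann", "bob"), ("bob", "cal"), ("bob", "dee"), ("cal", "gus")}
grandparent = {("ann", "cal"), ("ann", "dee")}  # observed positive facts

# Ground the rule body over all pairs of parent facts.
predicted = {(x, z) for (x, y1) in parent for (y2, z) in parent if y1 == y2}
tp = predicted & grandparent
precision = len(tp) / len(predicted)   # how often the rule's conclusions hold
recall = len(tp) / len(grandparent)    # how many positives the rule explains
f1 = 2 * precision * recall / (precision + recall)  # one possible composition
print(sorted(predicted), round(precision, 2), recall, round(f1, 2))
```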
- Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics [39.07258580928359]
We study computationally and statistically efficient Reinforcement Learning algorithms for the linear Bellman Complete setting.
This setting uses linear function approximation to capture value functions and unifies existing models such as linear Markov Decision Processes (MDP) and Linear Quadratic Regulators (LQR).
Our work provides a computationally efficient algorithm for the linear Bellman complete setting that works for MDPs with large action spaces, random initial states, and random rewards, but requires the underlying dynamics to be deterministic.
arXiv Detail & Related papers (2024-06-17T17:52:38Z)
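The linear Bellman complete setting builds on regression-based value iteration with linear features. The sketch below shows plain least-squares value iteration on a toy deterministic two-state MDP; it is not the paper's algorithm, and the feature map and data are assumptions.

```python
import numpy as np

def lsvi(phi, data, horizon, n_actions):
    """data: list of (s, a, r, s_next) samples; phi maps (s, a) to features."""
    d = phi(0, 0).size
    theta = [np.zeros(d) for _ in range(horizon + 1)]  # Q_h(s,a) ~ phi(s,a) @ theta[h]
    for h in reversed(range(horizon)):
        A, b = 1e-6 * np.eye(d), np.zeros(d)  # tiny ridge for numerical stability
        for s, a, r, s2 in data:
            v_next = max(phi(s2, a2) @ theta[h + 1] for a2 in range(n_actions))
            x = phi(s, a)
            A += np.outer(x, x)
            b += x * (r + v_next)  # regress onto the Bellman backup target
        theta[h] = np.linalg.solve(A, b)
    return theta

# Toy deterministic 2-state chain: action 1 moves right and pays off in state 1.
phi = lambda s, a: np.array([1.0, float(s), float(a), float(s) * float(a)])
data = [(0, 0, 0.0, 0), (0, 1, 0.0, 1), (1, 0, 0.0, 0), (1, 1, 1.0, 1)]
theta = lsvi(phi, data, horizon=3, n_actions=2)
print([round(phi(0, a) @ theta[0], 2) for a in (0, 1)])  # action 1 looks better
```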
- Fuzzy Fault Trees Formalized [1.640922391885265]
Fuzzy logic is a popular framework for dealing with ambiguous values.
In this paper, we define a rigorous framework for fuzzy unreliability values.
We also provide a bottom-up algorithm to efficiently calculate fuzzy reliability for a system.
arXiv Detail & Related papers (2024-03-13T14:45:54Z)
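A toy of the bottom-up evaluation the fuzzy fault-tree entry describes, with triangular fuzzy failure probabilities propagated through AND/OR gates. Applying the gate formulas component-wise, as here, is a common approximation rather than the paper's rigorous framework; the tree and its numbers are invented.

```python
# Triangular fuzzy probabilities as (low, mode, high) triples.
AND = lambda a, b: tuple(x * y for x, y in zip(a, b))          # both events fail
OR = lambda a, b: tuple(x + y - x * y for x, y in zip(a, b))   # at least one fails

pump = (0.01, 0.02, 0.04)      # basic events with fuzzy failure probabilities
valve = (0.05, 0.08, 0.10)
sensor = (0.001, 0.002, 0.01)

# Top event: (pump AND valve) OR sensor, evaluated bottom-up from the leaves.
top = OR(AND(pump, valve), sensor)
print(tuple(round(p, 6) for p in top))
```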
- Adaptive Experimentation at Scale: A Computational Framework for Flexible Batches [7.390918770007728]
Motivated by practical instances that involve only a handful of reallocations and measure outcomes in batches, we develop a batch-based adaptive experimentation framework.
Our main observation is that normal approximations, which are universal in statistical inference, can also guide the design of adaptive algorithms.
arXiv Detail & Related papers (2023-03-21T04:17:03Z)
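A loose sketch of the normal-approximation intuition from the entry above: summarize each arm's batched outcomes by a Gaussian and allocate the next batch by sampling from those Gaussians (Thompson-style). The paper derives allocations more carefully; the arm means, batch sizes, and allocation rule here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
true_means = [0.2, 0.25, 0.4]
n, s = np.ones(3), np.zeros(3)           # per-arm counts and sums (1 pseudo-obs)

for batch in range(5):
    mean, se = s / n, 1.0 / np.sqrt(n)   # CLT: batch data -> Gaussian summary
    draws = rng.normal(mean, se, size=(100, 3))
    alloc = np.bincount(draws.argmax(axis=1), minlength=3)  # next-batch shares
    for arm, k in enumerate(alloc):
        x = rng.normal(true_means[arm], 1.0, size=k)        # run the batch
        n[arm] += k
        s[arm] += x.sum()
print(np.round(s / n, 2), n.astype(int))  # best arm receives most measurements
```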
- Non-Clairvoyant Scheduling with Predictions Revisited [77.86290991564829]
In non-clairvoyant scheduling, the task is to find an online strategy for scheduling jobs with a priori unknown processing requirements.
We revisit this well-studied problem in a recently popular learning-augmented setting that integrates (untrusted) predictions into algorithm design.
We show that these predictions have the desired properties and admit a natural error measure, as well as algorithms with strong performance guarantees.
arXiv Detail & Related papers (2022-02-21T13:18:11Z)
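One standard way to integrate untrusted predictions into non-clairvoyant scheduling is to time-share between round-robin (robust without predictions) and shortest-predicted-job-first (strong when predictions are accurate). The toy simulation below sketches that blend with a trust parameter lam; the paper's algorithms and error measure are more refined than this.

```python
def completion_times(jobs, preds, lam=0.75, dt=0.001):
    """Discrete-time simulation: lam of the machine follows the predictions,
    the remaining 1 - lam is shared round-robin among unfinished jobs."""
    remaining = list(jobs)
    done, t = [None] * len(jobs), 0.0
    while any(r > 1e-9 for r in remaining):
        alive = [i for i, r in enumerate(remaining) if r > 1e-9]
        spjf = min(alive, key=lambda i: preds[i])  # trust the predictions...
        for i in alive:
            rate = lam * (i == spjf) + (1 - lam) / len(alive)  # ...but hedge
            remaining[i] -= rate * dt
            if remaining[i] <= 1e-9 and done[i] is None:
                done[i] = t + dt
        t += dt
    return done

jobs = [1.0, 2.0, 3.0]
print(sum(completion_times(jobs, preds=jobs)))       # perfect predictions
print(sum(completion_times(jobs, preds=[3, 2, 1])))  # adversarial predictions
```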
- Robustification of Online Graph Exploration Methods [59.50307752165016]
We study a learning-augmented variant of the classical, notoriously hard online graph exploration problem.
We propose an algorithm that naturally integrates predictions into the well-known Nearest Neighbor (NN) algorithm.
arXiv Detail & Related papers (2021-12-10T10:02:31Z)
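For context, here is the plain Nearest Neighbor exploration heuristic that the entry's algorithm augments with predictions: always move to the closest unexplored vertex. How the predictions are blended in is the paper's contribution and is not reproduced here; the random metric instance is an assumption.

```python
import numpy as np

def nn_tour(dist, start=0):
    """Greedy exploration of a metric instance given by a distance matrix."""
    n = dist.shape[0]
    unexplored, tour, cost = set(range(n)) - {start}, [start], 0.0
    while unexplored:
        cur = tour[-1]
        nxt = min(unexplored, key=lambda v: dist[cur, v])  # nearest unexplored
        cost += dist[cur, nxt]
        tour.append(nxt)
        unexplored.remove(nxt)
    return tour, cost

rng = np.random.default_rng(2)
pts = rng.random((6, 2))
dist = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)  # Euclidean metric
print(nn_tour(dist))
```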
- Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination [82.52105963476703]
A recurring theme in statistical learning, online learning, and beyond is that faster convergence rates are possible for problems with low noise.
First-order guarantees are relatively well understood in statistical and online learning.
We show that the logarithmic loss and an information-theoretic quantity called the triangular discrimination play a fundamental role in obtaining first-order guarantees.
arXiv Detail & Related papers (2021-07-05T19:20:34Z)
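The triangular discrimination mentioned above is the f-divergence $\sum_i (p_i - q_i)^2 / (p_i + q_i)$. A quick numerical illustration follows; its specific role in first-order bandit guarantees is left to the paper.

```python
import numpy as np

def triangular_discrimination(p, q):
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = (p + q) > 0  # convention: terms with p_i = q_i = 0 contribute 0
    return np.sum((p[mask] - q[mask]) ** 2 / (p[mask] + q[mask]))

p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]
print(triangular_discrimination(p, q))  # small when p and q are close
print(triangular_discrimination(p, p))  # 0.0 for identical distributions
```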
- Double Coverage with Machine-Learned Advice [100.23487145400833]
We study the fundamental online $k$-server problem in a learning-augmented setting.
We show that our algorithm achieves, for any $k$, an almost optimal consistency-robustness tradeoff.
arXiv Detail & Related papers (2021-03-02T11:04:33Z)
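For reference, the classical Double Coverage rule for $k$-server on the line, which the entry augments with machine-learned advice: the servers adjacent to a request move toward it at equal speed, and a lone nearest server moves if the request falls outside their span. The sketch implements only the classical rule, not the consistency-robustness blending.

```python
def double_coverage(servers, request):
    """Serve `request` on the line; return the new server positions."""
    s = sorted(servers)
    left = [x for x in s if x <= request]
    right = [x for x in s if x > request]
    if not left:
        right[0] = request   # request lies left of every server
    elif not right:
        left[-1] = request   # request lies right of every server
    else:
        d = min(request - left[-1], right[0] - request)
        left[-1] += d        # both neighbors move distance d;
        right[0] -= d        # the nearer one reaches the request
    return sorted(left + right)

positions = [0.0, 10.0]
for r in [4.0, 9.0, 1.0]:
    positions = double_coverage(positions, r)
    print(positions)
```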
- Bayes DistNet -- A Robust Neural Network for Algorithm Runtime Distribution Predictions [1.8275108630751844]
Randomized algorithms are used in many state-of-the-art solvers for constraint satisfaction problems (CSP) and Boolean satisfiability (SAT) problems.
Previous state-of-the-art methods directly try to predict a fixed parametric distribution that the input instance follows.
This new model achieves robust predictive performance in the low-observation setting and also handles censored observations.
arXiv Detail & Related papers (2020-12-14T01:15:39Z)
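A hedged illustration of the censored-observation issue Bayes DistNet addresses: runs cut off at a timeout should contribute a survival probability, not a density, to the likelihood. A fixed lognormal stands in for the network's predicted distribution; nothing below mirrors the paper's architecture.

```python
import math

def lognormal_nll(runtimes, censored, mu, sigma):
    """Negative log-likelihood; censored[i] marks a run cut off at its timeout."""
    nll = 0.0
    for t, cens in zip(runtimes, censored):
        z = (math.log(t) - mu) / sigma
        if cens:
            # P(T > timeout): survival function of the lognormal
            nll -= math.log(0.5 * math.erfc(z / math.sqrt(2)))
        else:
            # log-density of an exact (uncensored) runtime observation
            nll -= -math.log(t * sigma * math.sqrt(2 * math.pi)) - 0.5 * z * z
    return nll

runs = [1.2, 3.4, 0.7, 10.0, 10.0]  # the last two runs hit a 10s timeout
cens = [False, False, False, True, True]
print(lognormal_nll(runs, cens, mu=0.5, sigma=1.0))
```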
- Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration [143.43658264904863]
We show how, under a more standard notion of low inherent Bellman error, typically employed in least-squares value-iteration-style algorithms, one can obtain strong PAC guarantees on learning a near-optimal value function.
We present a computationally tractable algorithm for the reward-free setting and show how it can be used to learn a near optimal policy for any (linear) reward function.
arXiv Detail & Related papers (2020-08-18T04:34:21Z)
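A toy numerical illustration of the "low inherent Bellman error" notion in the entry above: how far the Bellman backup of a linear value function can fall outside the linear function class. The features, transition kernel, simplified backup, and Monte Carlo sup below are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n_sa, d, gamma = 8, 3, 0.9
Phi = rng.normal(size=(n_sa, d))             # feature matrix over (s, a) pairs
P = rng.dirichlet(np.ones(n_sa), size=n_sa)  # toy transition kernel over pairs
r = rng.random(n_sa)

proj = Phi @ np.linalg.pinv(Phi)             # projector onto the feature span
err = 0.0
for _ in range(200):                         # crude sampled sup over linear Qs
    q = Phi @ rng.normal(size=d)
    backup = r + gamma * P @ q               # (simplified) Bellman backup
    err = max(err, np.max(np.abs(backup - proj @ backup)))
print(round(err, 3))                         # 0 would mean Bellman completeness
```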
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.