FIRE: An Optimization Approach for Fast Interpretable Rule Extraction
- URL: http://arxiv.org/abs/2306.07432v1
- Date: Mon, 12 Jun 2023 21:27:39 GMT
- Title: FIRE: An Optimization Approach for Fast Interpretable Rule Extraction
- Authors: Brian Liu and Rahul Mazumder
- Abstract summary: We present FIRE, Fast Interpretable Rule Extraction, an optimization-based framework to extract a small collection of decision rules from tree ensembles.
We show in our experiments that FIRE outperforms state-of-the-art rule ensemble algorithms at building sparse rule sets, and can deliver more interpretable models compared to existing methods.
- Score: 7.538482310185135
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present FIRE, Fast Interpretable Rule Extraction, an optimization-based
framework to extract a small but useful collection of decision rules from tree
ensembles. FIRE selects sparse, representative subsets of rules from tree
ensembles that are easy for a practitioner to examine. To further enhance the
interpretability of the extracted model, FIRE encourages fusing rules during
selection, so that many of the selected decision rules share common
antecedents. The optimization framework utilizes a fusion regularization
penalty to accomplish this, along with a non-convex sparsity-inducing penalty
to aggressively select rules. Optimization problems in FIRE pose a challenge to
off-the-shelf solvers due to problem scale and the non-convexity of the
penalties. To address this, making use of problem structure, we develop a
specialized solver based on block coordinate descent principles; our solver
performs up to 40x faster than existing solvers. We show in our experiments
that FIRE outperforms state-of-the-art rule ensemble algorithms at building
sparse rule sets, and can deliver more interpretable models compared to
existing methods.
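The abstract does not spell out the optimization problem, but a minimal sketch consistent with its description is a penalized regression over the K rules extracted from the ensemble: with a rule-output matrix R of shape (n, K), minimize the squared loss plus a non-convex sparsity-inducing penalty on the rule weights (the full method adds a fusion penalty coupling rules that share antecedents, omitted here for brevity). The block coordinate descent sketch below uses the minimax concave penalty (MCP) as one common non-convex choice; the MCP choice, the column scaling, and the function names are assumptions, not details from the paper.
```python
import numpy as np

def mcp_prox(z, lam, gamma):
    """Prox of the minimax concave penalty (firm thresholding); requires gamma > 1."""
    if abs(z) <= lam:
        return 0.0
    if abs(z) <= gamma * lam:
        return np.sign(z) * (abs(z) - lam) / (1.0 - 1.0 / gamma)
    return z

def bcd_rule_selection(R, y, lam=0.1, gamma=3.0, n_sweeps=100):
    """Cyclic (block) coordinate descent for
        min_w  1/(2n) ||y - R w||^2 + lam * sum_k MCP(w_k; lam, gamma),
    assuming the columns of R are scaled so that ||R[:, k]||^2 = n."""
    n, K = R.shape
    w = np.zeros(K)
    r = y.astype(float).copy()            # residual y - R @ w (w = 0 initially)
    for _ in range(n_sweeps):
        for k in range(K):
            z = w[k] + R[:, k] @ r / n    # 1-D least-squares target for w_k
            w_new = mcp_prox(z, lam, gamma)
            if w_new != w[k]:
                r += R[:, k] * (w[k] - w_new)   # keep residual in sync
                w[k] = w_new
    return w                              # sparse rule weights
```
Maintaining the residual incrementally makes each coordinate update cheap, which is consistent with the paper's emphasis on exploiting problem structure for speed.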
Related papers
- LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning [56.273799410256075]
The framework combines Monte Carlo Tree Search (MCTS) with iterative Self-Refine to optimize the reasoning path.
The framework has been tested on general and advanced benchmarks, showing superior performance in terms of search efficiency and problem-solving capability.
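As a rough illustration of the "MCTS with iterative Self-Refine" idea in the summary, here is a hedged skeleton, not the paper's implementation: each tree expansion rewrites its parent's solution via a refine step, and rollout rewards decide which solution paths get refined further. The callables propose, refine, and score stand in for LLM calls and a reward model and are purely hypothetical.
```python
import math

def mcts_self_refine(problem, propose, refine, score, n_rollouts=32, c=1.4):
    """Hedged skeleton: selection by UCT, expansion by self-refining the
    selected solution, evaluation by a reward model, then backpropagation."""
    root = {"sol": propose(problem), "reward": None,
            "children": [], "visits": 0, "value": 0.0}
    root["reward"] = score(problem, root["sol"])
    nodes = [root]
    for _ in range(n_rollouts):
        node, path = root, [root]
        while node["children"]:           # selection: descend by UCT
            parent = node
            node = max(parent["children"],
                       key=lambda ch: ch["value"] / (ch["visits"] + 1e-9)
                       + c * math.sqrt(math.log(parent["visits"] + 1)
                                       / (ch["visits"] + 1e-9)))
            path.append(node)
        child = {"sol": refine(problem, node["sol"]),   # expansion = refine
                 "reward": None, "children": [], "visits": 0, "value": 0.0}
        child["reward"] = score(problem, child["sol"])  # evaluation
        node["children"].append(child)
        nodes.append(child)
        path.append(child)
        for nd in path:                   # backpropagation
            nd["visits"] += 1
            nd["value"] += child["reward"]
    return max(nodes, key=lambda nd: nd["reward"])["sol"]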
arXiv Detail & Related papers (2024-10-03T18:12:29Z)
- Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer [52.09480867526656]
We identify the source of misalignment as a form of distributional shift and uncertainty in learning human preferences.
To mitigate overoptimization, we first propose a theoretical algorithm that chooses the best policy for an adversarially chosen reward model.
Using the equivalence between reward models and the corresponding optimal policy, the algorithm features a simple objective that combines a preference optimization loss and a supervised learning loss.
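A minimal sketch of the "preference optimization loss plus supervised learning loss" objective the summary describes, assuming a DPO-style preference term; the function name, argument conventions, and the weights beta and eta are illustrative, not the paper's notation.
```python
import torch
import torch.nn.functional as F

def combined_preference_loss(logp_chosen, logp_rejected,
                             ref_logp_chosen, ref_logp_rejected,
                             beta=0.1, eta=1.0):
    """Hedged sketch: DPO-style preference loss plus an SFT (maximum-
    likelihood) term on the chosen responses acting as a regularizer.
    Arguments are summed per-sequence log-probabilities (1-D tensors)."""
    # implicit reward margins of the policy relative to the reference model
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    pref_loss = -F.logsigmoid(margin).mean()   # preference optimization term
    sft_loss = -logp_chosen.mean()             # supervised learning term
    return pref_loss + eta * sft_loss
```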
arXiv Detail & Related papers (2024-05-26T05:38:50Z)
- Obtaining Explainable Classification Models using Distributionally Robust Optimization [12.511155426574563]
We study generalized linear models constructed using sets of feature value rules.
An inherent trade-off exists between rule set sparsity and prediction accuracy.
We propose a new formulation to learn an ensemble of rule sets that simultaneously addresses these competing factors.
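The summary does not give the formulation, but the competing factors it names can be written schematically as a distributionally robust loss plus a sparsity term; the ambiguity set and the weight below are placeholders, not the paper's actual model.
```latex
\min_{R}\;\; \max_{Q \in \mathcal{U}(\hat{P})}\; \mathbb{E}_{(x,y)\sim Q}\big[\ell\big(y, f_R(x)\big)\big] \;+\; \lambda\,|R|
```
Here $\hat{P}$ is the empirical distribution, $\mathcal{U}(\hat{P})$ an ambiguity set around it, $f_R$ the rule-based model, and $|R|$ the number of rules; increasing $\lambda$ trades accuracy for sparsity.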
arXiv Detail & Related papers (2023-11-03T15:45:34Z)
- Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit analogies between first-order algorithms for constrained optimization and non-smooth dynamical systems to design a new class of accelerated first-order algorithms.
An important property of these algorithms is that constraints are expressed in terms of velocities instead of positions.
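A hedged illustration of the velocity-level viewpoint (a schematic in the spirit of the summary, not the paper's exact scheme): for a feasible set described by $g(x) \ge 0$, one constrains the velocity rather than the next iterate,
```latex
\nabla g(x_k)^{\top} v_k \;\ge\; -\alpha\, g(x_k), \qquad x_{k+1} = x_k + T\, v_k,
```
which is linear, and hence convex, in $v_k$ even when $g$ is nonlinear, so each step only requires a local convex approximation of the feasible set.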
arXiv Detail & Related papers (2023-02-01T08:50:48Z)
- Learning Interpretable Decision Rule Sets: A Submodular Optimization Approach [12.710158664288784]
We consider a submodular optimization-based approach for learning rule sets.
We employ an objective function that exhibits submodularity and thus is amenable to submodular optimization techniques.
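The appeal to submodularity suggests the textbook greedy algorithm, which enjoys a $(1 - 1/e)$ approximation guarantee for monotone submodular objectives under a cardinality constraint. The sketch below is that generic baseline with hypothetical identifiers, not the paper's exact procedure.
```python
def greedy_submodular(candidates, objective, k):
    """Greedy maximization of a set function under |S| <= k.
    candidates: set of candidate rules; objective: callable scoring a set."""
    selected = set()
    for _ in range(k):
        best, best_gain = None, 0.0
        for c in candidates - selected:
            gain = objective(selected | {c}) - objective(selected)
            if gain > best_gain:
                best, best_gain = c, gain
        if best is None:        # no candidate has positive marginal gain
            break
        selected.add(best)
    return selected
```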
arXiv Detail & Related papers (2022-06-08T07:41:47Z)
- Fast Feature Selection with Fairness Constraints [49.142308856826396]
We study the fundamental problem of selecting optimal features for model construction.
This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants.
We extend the adaptive query model, recently proposed for the greedy forward selection for submodular functions, to the faster paradigm of Orthogonal Matching Pursuit for non-submodular functions.
The proposed algorithm achieves exponentially fast parallel run time in the adaptive query model, scaling much better than prior work.
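For context on the paradigm the summary references, here is plain serial Orthogonal Matching Pursuit; the paper's contribution is a parallel, adaptive-query variant with fairness constraints, which this baseline deliberately omits.
```python
import numpy as np

def omp(X, y, k):
    """Serial Orthogonal Matching Pursuit: greedily add the feature most
    correlated with the residual, then refit the support by least squares."""
    support = []
    beta = np.zeros(0)
    residual = y.astype(float).copy()
    for _ in range(k):
        j = int(np.argmax(np.abs(X.T @ residual)))  # most correlated feature
        if j in support:                            # no progress; stop early
            break
        support.append(j)
        beta, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        residual = y - X[:, support] @ beta
    return support, beta
```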
arXiv Detail & Related papers (2022-02-28T12:26:47Z)
- Learning Proximal Operators to Discover Multiple Optima [66.98045013486794]
We present an end-to-end method to learn the proximal operator across a family of non-convex problems.
We show that for weakly convex objectives and under mild conditions, the method converges globally.
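For reference, the proximal operator being learned is, for a step size $t > 0$,
```latex
\operatorname{prox}_{t f}(x) \;=\; \operatorname*{arg\,min}_{y}\; f(y) + \frac{1}{2t}\,\lVert y - x\rVert_2^2,
```
and for a $\rho$-weakly convex $f$ the inner objective is strongly convex whenever $t < 1/\rho$, so the operator is single-valued. This is the standard definition, stated here for context rather than taken from the paper.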
arXiv Detail & Related papers (2022-01-28T05:53:28Z)
- Interpretable and Fair Boolean Rule Sets via Column Generation [18.08486863429421]
An integer program is formulated to optimally trade classification accuracy for rule simplicity.
We consider the fairness setting and extend the formulation to include explicit constraints on two different measures of classification parity.
Compared to other fair and interpretable classifiers, our method is able to find rule sets that meet stricter notions of fairness with a modest trade-off in accuracy.
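A schematic of the kind of integer program the summary describes, in placeholder notation rather than the paper's formulation: minimize misclassification error plus a rule-complexity term, subject to a parity constraint between protected groups $g_1$ and $g_2$,
```latex
\min_{R}\;\; \mathrm{err}(R) + \lambda\,\mathrm{size}(R)
\quad \text{s.t.} \quad \big|\,\mathrm{rate}_{g_1}(R) - \mathrm{rate}_{g_2}(R)\,\big| \le \epsilon,
```
where $\mathrm{rate}_g$ could be the positive-prediction rate (statistical parity) or an error rate (equality-of-opportunity-style parity), matching the two measures of classification parity the summary mentions.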
arXiv Detail & Related papers (2021-11-16T13:40:28Z)
- Better Short than Greedy: Interpretable Models through Optimal Rule Boosting [10.938624307941197]
Rule ensembles are designed to provide a useful trade-off between predictive accuracy and model interpretability.
We present a novel approach aiming to fit rule ensembles of maximal predictive power for a given ensemble size.
arXiv Detail & Related papers (2021-01-21T01:03:48Z)
- An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z)
- Screening Rules and its Complexity for Active Set Identification [16.762870396299334]
We show that screening rules stem from a combination of natural properties of subdifferential sets and optimality conditions.
Under mild assumptions, we analyze the number of iterations needed to identify the optimal active set for any converging algorithm.
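A concrete, standard example of the mechanism (the lasso case, given here for intuition; the paper treats a more general setting): for $\min_\beta \tfrac12\lVert y - X\beta\rVert_2^2 + \lambda\lVert\beta\rVert_1$, the optimality conditions give
```latex
\big|\,x_j^{\top}(y - X\beta^{\star})\,\big| < \lambda \;\;\Longrightarrow\;\; \beta^{\star}_j = 0,
```
so a screening rule bounds the left-hand side without knowing $\beta^{\star}$ and safely discards any feature whose bound certifies the strict inequality.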
arXiv Detail & Related papers (2020-09-06T11:10:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.