Exploring the Whole Rashomon Set of Sparse Decision Trees
- URL: http://arxiv.org/abs/2209.08040v1
- Date: Fri, 16 Sep 2022 16:37:26 GMT
- Title: Exploring the Whole Rashomon Set of Sparse Decision Trees
- Authors: Rui Xin, Chudi Zhong, Zhi Chen, Takuya Takagi, Margo Seltzer, Cynthia
Rudin
- Abstract summary: The Rashomon set is the set of all almost-optimal models for a given learning problem.
We provide the first technique for completely enumerating the Rashomon set for sparse decision trees.
This allows the user an unprecedented level of control over model choice.
- Score: 23.136590456299007
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In any given machine learning problem, there may be many models that could
explain the data almost equally well. However, most learning algorithms return
only one of these models, leaving practitioners with no practical way to
explore alternative models that might have desirable properties beyond what
could be expressed within a loss function. The Rashomon set is the set of all
these almost-optimal models. Rashomon sets can be extremely complicated,
particularly for highly nonlinear function classes that allow complex
interaction terms, such as decision trees. We provide the first technique for
completely enumerating the Rashomon set for sparse decision trees; in fact, our
work provides the first complete enumeration of any Rashomon set for a
non-trivial problem with a highly nonlinear discrete function class. This
allows the user an unprecedented level of control over model choice among all
models that are approximately equally good. We represent the Rashomon set in a
specialized data structure that supports efficient querying and sampling. We
show three applications of the Rashomon set: 1) it can be used to study
variable importance for the set of almost-optimal trees (as opposed to a single
tree), 2) the Rashomon set for accuracy enables enumeration of the Rashomon
sets for balanced accuracy and F1-score, and 3) the Rashomon set for a full
dataset can be used to produce Rashomon sets constructed with only subsets of
the data set. Thus, we are able to examine Rashomon sets across problems with a
new lens, enabling users to choose models rather than be at the mercy of an
algorithm that produces only a single model.
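To make the definition concrete, here is a minimal, self-contained sketch. It is not the paper's algorithm (which enumerates sparse decision trees with a specialized search and data structure); it brute-forces the Rashomon set of depth-1 trees (stumps), i.e., all stumps whose 0-1 loss is within an additive tolerance epsilon of the best stump. The toy data, the stump class, and the additive tolerance are all assumptions of this sketch.

```python
# Minimal sketch (not the paper's algorithm, which uses a specialized search
# and data structure for sparse trees): brute-force the Rashomon set of
# depth-1 decision trees, i.e., all stumps whose 0-1 loss is within an
# additive tolerance `epsilon` of the best stump's loss.
import numpy as np

def stump_rashomon_set(X, y, epsilon=0.02):
    candidates = []
    for j in range(X.shape[1]):                       # each feature
        for t in np.unique(X[:, j]):                  # each candidate threshold
            pred = (X[:, j] <= t).astype(int)
            for sign, p in ((+1, pred), (-1, 1 - pred)):  # both orientations
                loss = float(np.mean(p != y))
                candidates.append(((j, float(t), sign), loss))
    best = min(loss for _, loss in candidates)
    return [(m, loss) for m, loss in candidates if loss <= best + epsilon]

# Toy data with 10% label noise, so many stumps are almost equally good.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)
y[rng.random(200) < 0.1] ^= 1
print(len(stump_rashomon_set(X, y)), "stumps in the Rashomon set")
```

With noisy labels, many distinct stumps sit within epsilon of the best one; this multiplicity is what makes choosing among near-optimal models, rather than accepting a single returned model, meaningful.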
Related papers
- Representing Model Weights with Language using Tree Experts [39.90685550999956]
This paper learns a joint embedding space for both model weights and language.
We introduce Probing Experts (ProbeX), a theoretically motivated, lightweight probing method.
Our results show that ProbeX can effectively map the weights of large models into a shared weight-language embedding space.
arXiv Detail & Related papers (2024-10-17T17:17:09Z)
- Efficient Exploration of the Rashomon Set of Rule Set Models [18.187800166484507]
An emerging paradigm in interpretable machine learning aims at exploring the Rashomon set of all models exhibiting near-optimal performance.
Existing work on Rashomon-set exploration focuses on exhaustive search of the Rashomon set for particular classes of models.
We propose, for the first time, efficient methods to explore the Rashomon set of rule set models with or without exhaustive search.
arXiv Detail & Related papers (2024-06-05T08:37:41Z)
- A Path to Simpler Models Starts With Noise [17.36067410506525]
The Rashomon set is the set of models that perform approximately equally well on a given dataset.
An open question is why Rashomon ratios tend to be large.
We show that noisier datasets lead to larger Rashomon ratios through the way that practitioners train models.
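To make "Rashomon ratio" concrete (the fraction of a hypothesis space that lies in the Rashomon set), here is a minimal sampling sketch. The hypothesis space (random linear classifiers), the tolerance, and the noise model are assumptions of the sketch, not the paper's setup; it only illustrates the qualitative claim that label noise enlarges the ratio.

```python
# Hedged sketch: estimate the Rashomon ratio -- the fraction of a hypothesis
# space whose loss is within `eps` of the best sampled loss -- and watch it
# grow with label noise. The hypothesis space (random linear classifiers),
# `eps`, and the noise model are illustrative assumptions, not the paper's.
import numpy as np

def rashomon_ratio(X, y, n_models=10_000, eps=0.02, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_models))   # random linear classifiers
    losses = ((X @ W > 0).astype(int) != y[:, None]).mean(axis=0)
    return float((losses <= losses.min() + eps).mean())

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
y_clean = (X[:, 0] > 0).astype(int)
for noise in (0.0, 0.1, 0.3):
    y = y_clean.copy()
    y[rng.random(len(y)) < noise] ^= 1            # flip a `noise` fraction
    print(f"noise={noise:.1f}  estimated Rashomon ratio = {rashomon_ratio(X, y):.4f}")
```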
arXiv Detail & Related papers (2023-10-30T16:52:57Z)
- Exploring and Interacting with the Set of Good Sparse Generalized Additive Models [26.64299550434767]
We present algorithms to approximate the Rashomon set of sparse, generalized additive models with ellipsoids for fixed support sets.
The approximated Rashomon set serves as a cornerstone to solve practical challenges such as (1) studying the variable importance for the model class; (2) finding models under user-specified constraints (monotonicity, direct editing); and (3) investigating sudden changes in the shape functions.
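To illustrate the ellipsoidal idea in the simplest possible setting: near an optimum w*, a smooth loss is approximately quadratic, so the near-optimal coefficient vectors form, approximately, the ellipsoid {w : ½(w − w*)ᵀH(w − w*) ≤ ε}, with H the Hessian at w*. The sketch below uses a plain linear-logistic model rather than a GAM, and its loss, data, and ε are assumptions, not the paper's construction.

```python
# Hedged sketch: approximate a Rashomon set by an ellipsoid via a quadratic
# expansion of the loss around the optimum. Uses a plain linear-logistic
# model (a simplification; the paper treats sparse generalized additive
# models with fixed support sets).
import numpy as np
from scipy.optimize import minimize

def logistic_loss(w, X, y):
    margin = (2 * y - 1) * (X @ w)
    return np.mean(np.log1p(np.exp(-margin)))

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X @ np.array([1.0, -0.5, 0.0, 0.0]) + 0.5 * rng.normal(size=500) > 0).astype(float)

w_star = minimize(logistic_loss, np.zeros(4), args=(X, y)).x

# Hessian of the mean logistic loss at w*: (1/n) * X^T diag(p(1-p)) X.
p = 1.0 / (1.0 + np.exp(-(X @ w_star)))
H = (X * (p * (1 - p))[:, None]).T @ X / len(y)

eps = 0.01  # loss tolerance defining the (approximate) Rashomon set
def in_ellipsoid(w):
    d = w - w_star
    return 0.5 * d @ H @ d <= eps

# Sample a near-optimal model just inside the ellipsoid boundary.
L = np.linalg.cholesky(np.linalg.inv(H))   # L @ L.T = H^{-1}
u = rng.normal(size=4)
u /= np.linalg.norm(u)
w_near = w_star + 0.99 * np.sqrt(2 * eps) * (L @ u)
print(in_ellipsoid(w_near),
      logistic_loss(w_near, X, y) - logistic_loss(w_star, X, y))
```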
arXiv Detail & Related papers (2023-03-28T15:25:46Z)
- Composite Feature Selection using Deep Ensembles [130.72015919510605]
We investigate the problem of discovering groups of predictive features without predefined grouping.
We introduce a novel deep learning architecture that uses an ensemble of feature selection models to find predictive groups.
We propose a new metric to measure similarity between discovered groups and the ground truth.
arXiv Detail & Related papers (2022-11-01T17:49:40Z)
- Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data alone by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z)
- Universal and data-adaptive algorithms for model selection in linear contextual bandits [52.47796554359261]
We consider the simplest non-trivial instance of model-selection: distinguishing a simple multi-armed bandit problem from a linear contextual bandit problem.
We introduce new algorithms that explore in a data-adaptive manner and provide guarantees of the form $\mathcal{O}(d^{\alpha} T^{1-\alpha})$.
Our approach extends to model selection among nested linear contextual bandits under some additional assumptions.
arXiv Detail & Related papers (2021-11-08T18:05:35Z)
- Partial Order in Chaos: Consensus on Feature Attributions in the Rashomon Set [50.67431815647126]
Post-hoc global/local feature attribution methods are increasingly used to understand machine learning models.
We show that partial orders of local/global feature importance arise from this methodology.
We show that every relation among features present in these partial orders also holds in the rankings provided by existing approaches.
arXiv Detail & Related papers (2021-10-26T02:53:14Z)
- Data Summarization via Bilevel Optimization [48.89977988203108]
A simple yet powerful approach to summarizing large datasets is to operate on small subsets of the data.
In this work, we propose a generic coreset framework that formulates the coreset selection as a cardinality-constrained bilevel optimization problem.
arXiv Detail & Related papers (2021-09-26T09:08:38Z)
- Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose a finite mixture regression (FMR) model that finds sample clusters and jointly models multiple incomplete mixed-type targets.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z)