A Path to Simpler Models Starts With Noise
- URL: http://arxiv.org/abs/2310.19726v1
- Date: Mon, 30 Oct 2023 16:52:57 GMT
- Title: A Path to Simpler Models Starts With Noise
- Authors: Lesia Semenova, Harry Chen, Ronald Parr, Cynthia Rudin
- Abstract summary: The Rashomon set is the set of models that perform approximately equally well on a given dataset.
An open question is why Rashomon ratios often tend to be large.
We show that noisier datasets lead to larger Rashomon ratios through the way that practitioners train models.
- Score: 17.36067410506525
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Rashomon set is the set of models that perform approximately equally well
on a given dataset, and the Rashomon ratio is the fraction of all models in a
given hypothesis space that are in the Rashomon set. Rashomon ratios are often
large for tabular datasets in criminal justice, healthcare, lending, education,
and in other areas, which has practical implications about whether simpler
models can attain the same level of accuracy as more complex models. An open
question is why Rashomon ratios often tend to be large. In this work, we
propose and study a mechanism of the data generation process, coupled with
choices usually made by the analyst during the learning process, that
determines the size of the Rashomon ratio. Specifically, we demonstrate that
noisier datasets lead to larger Rashomon ratios through the way that
practitioners train models. Additionally, we introduce a measure called pattern
diversity, which captures the average difference in predictions between
distinct classification patterns in the Rashomon set, and motivate why it tends
to increase with label noise. Our results explain a key aspect of why simpler
models often tend to perform as well as black box models on complex, noisier
datasets.
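The abstract's two central quantities, the Rashomon ratio and pattern diversity, can be illustrated on a toy problem. The sketch below is a hypothetical simplification and not the paper's formal setting: the hypothesis space is a grid of 1-D threshold classifiers, the Rashomon set is taken to be all models within epsilon of the best empirical 0-1 loss, and pattern diversity is computed as the average Hamming distance between distinct prediction patterns in that set. All function names and parameters here are illustrative choices, not taken from the paper.

```python
import random

def make_data(n, noise, seed=0):
    """Labels follow a threshold-at-0.5 concept; each label flips with prob `noise`."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [(1 if x > 0.5 else 0) ^ (1 if rng.random() < noise else 0) for x in xs]
    return xs, ys

def predictions(xs, t, sign):
    # sign=+1 predicts 1 above the threshold, sign=-1 predicts 1 below it
    return tuple(1 if ((x > t) == (sign == 1)) else 0 for x in xs)

def rashomon_set(xs, ys, eps=0.05):
    """Return (Rashomon ratio, prediction patterns) over a grid of threshold models."""
    models = [(t / 100.0, s) for t in range(101) for s in (1, -1)]
    scored = []
    for t, s in models:
        pat = predictions(xs, t, s)
        loss = sum(p != y for p, y in zip(pat, ys)) / len(ys)
        scored.append((loss, pat))
    best = min(loss for loss, _ in scored)
    in_set = [pat for loss, pat in scored if loss <= best + eps]
    return len(in_set) / len(models), in_set

def pattern_diversity(patterns):
    """Average Hamming distance between distinct prediction patterns."""
    pats = list(dict.fromkeys(patterns))  # deduplicate, order-preserving
    if len(pats) < 2:
        return 0.0
    n = len(pats[0])
    dists = [sum(a != b for a, b in zip(pats[i], pats[j])) / n
             for i in range(len(pats)) for j in range(i + 1, len(pats))]
    return sum(dists) / len(dists)

xs, ys_clean = make_data(1000, noise=0.0)
_, ys_noisy = make_data(1000, noise=0.3)
ratio_clean, pats_clean = rashomon_set(xs, ys_clean)
ratio_noisy, pats_noisy = rashomon_set(xs, ys_noisy)
print(f"Rashomon ratio (clean): {ratio_clean:.3f}")
print(f"Rashomon ratio (noisy): {ratio_noisy:.3f}")
print(f"Pattern diversity (noisy): {pattern_diversity(pats_noisy):.3f}")
```

On this toy construction the ratio under label noise is typically larger than on clean labels: flipped labels raise every model's loss toward the same floor, so many more thresholds fall within epsilon of the best one. This only sketches the mechanism the paper analyzes.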
Related papers
- Efficient Exploration of the Rashomon Set of Rule Set Models [18.187800166484507]
An emerging paradigm in interpretable machine learning aims at exploring the Rashomon set of all models exhibiting near-optimal performance.
Existing work on Rashomon-set exploration focuses on exhaustive search of the Rashomon set for particular classes of models.
We propose, for the first time, efficient methods to explore the Rashomon set of rule set models with or without exhaustive search.
arXiv Detail & Related papers (2024-06-05T08:37:41Z)
- Predictive Churn with the Set of Good Models [64.05949860750235]
We study the effect of conflicting predictions over the set of near-optimal machine learning models.
We present theoretical results on the expected churn between models within the Rashomon set.
We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
arXiv Detail & Related papers (2024-02-12T16:15:25Z)
- Random Models for Fuzzy Clustering Similarity Measures [0.0]
The Adjusted Rand Index (ARI) is a widely used method for comparing hard clusterings.
We propose a single framework for computing the ARI with three random models that are intuitive and explainable for both hard and fuzzy clusterings.
arXiv Detail & Related papers (2023-12-16T00:07:04Z)
- Exploring and Interacting with the Set of Good Sparse Generalized Additive Models [26.64299550434767]
We present algorithms to approximate the Rashomon set of sparse, generalized additive models with ellipsoids for fixed support sets.
The approximated Rashomon set serves as a cornerstone to solve practical challenges such as (1) studying the variable importance for the model class; (2) finding models under user-specified constraints (monotonicity, direct editing); and (3) investigating sudden changes in the shape functions.
arXiv Detail & Related papers (2023-03-28T15:25:46Z)
- Simplicity Bias Leads to Amplified Performance Disparities [8.60453031364566]
We show that SGD-trained models have a bias towards simplicity, leading them to prioritize learning the majority class.
A model may prioritize any class or group of the dataset that it finds simple, at the expense of what it finds complex.
arXiv Detail & Related papers (2022-12-13T15:24:41Z)
- Exploring the Whole Rashomon Set of Sparse Decision Trees [23.136590456299007]
The Rashomon set is the set of all almost-optimal models.
We provide the first technique for completely enumerating the Rashomon set for sparse decision trees.
This allows the user an unprecedented level of control over model choice.
arXiv Detail & Related papers (2022-09-16T16:37:26Z)
- Partial Order in Chaos: Consensus on Feature Attributions in the Rashomon Set [50.67431815647126]
Post-hoc global/local feature attribution methods are increasingly employed to understand machine learning models.
We show that partial orders of local/global feature importance arise from this methodology.
We show that every relation among features present in these partial orders also holds in the rankings provided by existing approaches.
arXiv Detail & Related papers (2021-10-26T02:53:14Z)
- Self-Damaging Contrastive Learning [92.34124578823977]
Unlabeled data in reality is commonly imbalanced and shows a long-tail distribution.
This paper proposes a principled framework called Self-Damaging Contrastive Learning to automatically balance the representation learning without knowing the classes.
Our experiments show that SDCLR significantly improves not only overall accuracies but also balancedness.
arXiv Detail & Related papers (2021-06-06T00:04:49Z)
- On the Efficacy of Adversarial Data Collection for Question Answering: Results from a Large-Scale Randomized Study [65.17429512679695]
In adversarial data collection (ADC), a human workforce interacts with a model in real time, attempting to produce examples that elicit incorrect predictions.
Despite ADC's intuitive appeal, it remains unclear when training on adversarial datasets produces more robust models.
arXiv Detail & Related papers (2021-06-02T00:48:33Z)
- Why do classifier accuracies show linear trends under distribution shift? [58.40438263312526]
The accuracies of models on one data distribution are approximately linear functions of their accuracies on another distribution.
We assume the probability that two models agree in their predictions is higher than what we can infer from their accuracy levels alone.
We show that a linear trend must occur when evaluating models on two distributions unless the size of the distribution shift is large.
arXiv Detail & Related papers (2020-12-31T07:24:30Z)
- Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets simultaneously.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z)
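One of the related papers above ("Random Models for Fuzzy Clustering Similarity Measures") builds on the Adjusted Rand Index. For hard clusterings, the classical ARI under the standard permutation model can be computed from the contingency table of pair counts; the sketch below is the textbook formula, not the paper's fuzzy extension.

```python
from math import comb
from collections import Counter

def adjusted_rand_index(labels_a, labels_b):
    """Classical ARI for two hard clusterings given as equal-length label lists."""
    n = len(labels_a)
    # contingency[i, j] = number of points in cluster i of A and cluster j of B
    contingency = Counter(zip(labels_a, labels_b))
    sum_ij = sum(comb(c, 2) for c in contingency.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)   # E[index] under the permutation model
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case: e.g. both clusterings put everything in one cluster
    return (sum_ij - expected) / (max_index - expected)
```

The ARI is 1 for identical clusterings (up to relabeling), is 0 in expectation for independent random clusterings, and can be negative when two clusterings agree less than chance would predict.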
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.