Predictive Churn with the Set of Good Models
- URL: http://arxiv.org/abs/2402.07745v1
- Date: Mon, 12 Feb 2024 16:15:25 GMT
- Title: Predictive Churn with the Set of Good Models
- Authors: Jamelle Watson-Daniels, Flavio du Pin Calmon, Alexander D'Amour, Carol
Long, David C. Parkes, Berk Ustun
- Abstract summary: We study the effect of conflicting predictions over the set of near-optimal machine learning models.
We present theoretical results on the expected churn between models within the Rashomon set.
We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
- Score: 64.05949860750235
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models in modern mass-market applications are often updated
over time. One of the foremost challenges faced is that, despite increasing
overall performance, these updates may flip specific model predictions in
unpredictable ways. In practice, researchers quantify the number of unstable
predictions between models pre and post update -- i.e., predictive churn. In
this paper, we study this effect through the lens of predictive multiplicity --
i.e., the prevalence of conflicting predictions over the set of near-optimal
models (the Rashomon set). We show how traditional measures of predictive
multiplicity can be used to examine expected churn over this set of prospective
models -- i.e., the set of models that may be used to replace a baseline model
in deployment. We present theoretical results on the expected churn between
models within the Rashomon set from different perspectives, and we characterize
expected churn over model updates via the Rashomon set, pairing our analysis
with empirical results on real-world datasets -- showing how our approach can
be used to better anticipate, reduce, and avoid churn in consumer-facing
applications. Further, we show that our approach is useful even for models
enhanced with uncertainty awareness.
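As a rough illustration of the quantities defined in the abstract (not the paper's actual code), predictive churn between a baseline and an updated model, and disagreement over a small set of near-optimal models, can be sketched as follows. The function names and toy data here are assumptions for illustration only:

```python
import numpy as np

def churn(preds_a, preds_b):
    """Fraction of examples whose predicted label flips between two models."""
    preds_a, preds_b = np.asarray(preds_a), np.asarray(preds_b)
    return float(np.mean(preds_a != preds_b))

def ambiguity(baseline_preds, rashomon_preds):
    """Fraction of examples where at least one near-optimal model
    disagrees with the baseline -- a standard predictive-multiplicity
    style measure over a (sampled) Rashomon set."""
    baseline = np.asarray(baseline_preds)
    disagrees = np.any(np.stack(rashomon_preds) != baseline, axis=0)
    return float(np.mean(disagrees))

# Toy example: a baseline and two near-optimal alternatives on 5 points.
baseline = [1, 0, 1, 1, 0]
alt1     = [1, 0, 0, 1, 0]   # flips the third point
alt2     = [1, 1, 1, 1, 0]   # flips the second point

print(churn(baseline, alt1))              # 0.2
print(ambiguity(baseline, [alt1, alt2]))  # 0.4
```

The churn of any single candidate update is bounded above by this set-level disagreement, which is what lets multiplicity measures anticipate worst-case churn before an update is chosen.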
Related papers
- Learning Augmentation Policies from A Model Zoo for Time Series Forecasting [58.66211334969299]
We introduce AutoTSAug, a learnable data augmentation method based on reinforcement learning.
By augmenting the marginal samples with a learnable policy, AutoTSAug substantially improves forecasting performance.
arXiv Detail & Related papers (2024-09-10T07:34:19Z)
- An Experimental Study on the Rashomon Effect of Balancing Methods in Imbalanced Classification [0.0]
This paper examines the impact of balancing methods on predictive multiplicity using the Rashomon effect.
This is crucial because, in data-centric AI, blindly selecting a model from a set of approximately equally accurate models is risky.
arXiv Detail & Related papers (2024-03-22T13:08:22Z)
- Multi-View Conformal Learning for Heterogeneous Sensor Fusion [0.12086712057375555]
We build and test multi-view and single-view conformal models for heterogeneous sensor fusion.
Our models provide theoretical marginal confidence guarantees since they are based on the conformal prediction framework.
Our results also show that multi-view models generate prediction sets with less uncertainty than single-view models.
arXiv Detail & Related papers (2024-02-19T17:30:09Z)
- EAMDrift: An interpretable self retrain model for time series [0.0]
We present EAMDrift, a novel method that combines forecasts from multiple individual predictors by weighting each prediction according to a performance metric.
EAMDrift is designed to automatically adapt to out-of-distribution patterns in data and identify the most appropriate models to use at each moment.
Our study on real-world datasets shows that EAMDrift outperforms individual baseline models by 20% and achieves comparable accuracy results to non-interpretable ensemble models.
arXiv Detail & Related papers (2023-05-31T13:25:26Z)
- Investigating Ensemble Methods for Model Robustness Improvement of Text Classifiers [66.36045164286854]
We analyze a set of existing bias features and demonstrate there is no single model that works best for all the cases.
By choosing an appropriate bias model, we can obtain a better robustness result than baselines with a more sophisticated model design.
arXiv Detail & Related papers (2022-10-28T17:52:10Z)
- Predictive Multiplicity in Probabilistic Classification [25.111463701666864]
We present a framework for measuring predictive multiplicity in probabilistic classification.
We demonstrate the incidence and prevalence of predictive multiplicity in real-world tasks.
Our results emphasize the need to report predictive multiplicity more widely.
arXiv Detail & Related papers (2022-06-02T16:25:29Z)
- Consistent Counterfactuals for Deep Models [25.1271020453651]
Counterfactual examples are used to explain predictions of machine learning models in key areas such as finance and medical diagnosis.
This paper studies the consistency of model prediction on counterfactual examples in deep networks under small changes to initial training conditions.
arXiv Detail & Related papers (2021-10-06T23:48:55Z)
- Back2Future: Leveraging Backfill Dynamics for Improving Real-time Predictions in Future [73.03458424369657]
In real-time forecasting in public health, data collection is a non-trivial and demanding task.
The 'backfill' phenomenon and its effect on model performance have barely been studied in the prior literature.
We formulate a novel problem and neural framework Back2Future that aims to refine a given model's predictions in real-time.
arXiv Detail & Related papers (2021-06-08T14:48:20Z)
- Learning Interpretable Deep State Space Model for Probabilistic Time Series Forecasting [98.57851612518758]
Probabilistic time series forecasting involves estimating the distribution of future values based on their history.
We propose a deep state space model for probabilistic time series forecasting whereby the non-linear emission model and transition model are parameterized by networks.
We show in experiments that our model produces accurate and sharp probabilistic forecasts.
arXiv Detail & Related papers (2021-01-31T06:49:33Z)
- Characterizing Fairness Over the Set of Good Models Under Selective Labels [69.64662540443162]
We develop a framework for characterizing predictive fairness properties over the set of models that deliver similar overall performance.
We provide tractable algorithms to compute the range of attainable group-level predictive disparities.
We extend our framework to address the empirically relevant challenge of selectively labelled data.
arXiv Detail & Related papers (2021-01-02T02:11:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.