Predictive Churn with the Set of Good Models
- URL: http://arxiv.org/abs/2402.07745v1
- Date: Mon, 12 Feb 2024 16:15:25 GMT
- Title: Predictive Churn with the Set of Good Models
- Authors: Jamelle Watson-Daniels, Flavio du Pin Calmon, Alexander D'Amour, Carol
Long, David C. Parkes, Berk Ustun
- Abstract summary: We study the effect of conflicting predictions over the set of near-optimal machine learning models.
We present theoretical results on the expected churn between models within the Rashomon set.
We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
- Score: 64.05949860750235
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models in modern mass-market applications are often updated
over time. One of the foremost challenges faced is that, despite increasing
overall performance, these updates may flip specific model predictions in
unpredictable ways. In practice, researchers quantify the number of unstable
predictions between models pre and post update -- i.e., predictive churn. In
this paper, we study this effect through the lens of predictive multiplicity --
i.e., the prevalence of conflicting predictions over the set of near-optimal
models (the Rashomon set). We show how traditional measures of predictive
multiplicity can be used to examine expected churn over this set of prospective
models -- i.e., the set of models that may be used to replace a baseline model
in deployment. We present theoretical results on the expected churn between
models within the Rashomon set from different perspectives, and we characterize
expected churn over model updates via the Rashomon set, pairing our analysis
with empirical results on real-world datasets -- showing how our approach can
be used to better anticipate, reduce, and avoid churn in consumer-facing
applications. Further, we show that our approach is useful even for models
enhanced with uncertainty awareness.
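As a rough illustration of the quantities defined in the abstract (not the paper's actual code), predictive churn between a baseline and an updated model, and disagreement over a small set of near-optimal models, can be sketched as follows. The function names and toy data here are assumptions for illustration only:

```python
import numpy as np

def churn(preds_a, preds_b):
    """Fraction of examples whose predicted label flips between two models."""
    preds_a, preds_b = np.asarray(preds_a), np.asarray(preds_b)
    return float(np.mean(preds_a != preds_b))

def ambiguity(baseline_preds, rashomon_preds):
    """Fraction of examples where at least one near-optimal model
    disagrees with the baseline -- a standard predictive-multiplicity
    style measure over a (sampled) Rashomon set."""
    baseline = np.asarray(baseline_preds)
    disagrees = np.any(np.stack(rashomon_preds) != baseline, axis=0)
    return float(np.mean(disagrees))

# Toy example: a baseline and two near-optimal alternatives on 5 points.
baseline = [1, 0, 1, 1, 0]
alt1     = [1, 0, 0, 1, 0]   # flips the third point
alt2     = [1, 1, 1, 1, 0]   # flips the second point

print(churn(baseline, alt1))              # 0.2
print(ambiguity(baseline, [alt1, alt2]))  # 0.4
```

The churn of any single candidate update is bounded above by this set-level disagreement, which is what lets multiplicity measures anticipate worst-case churn before an update is chosen.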
Related papers
- Learning Augmentation Policies from A Model Zoo for Time Series Forecasting [58.66211334969299]
We introduce AutoTSAug, a learnable data augmentation method based on reinforcement learning.
By augmenting the marginal samples with a learnable policy, AutoTSAug substantially improves forecasting performance.
arXiv Detail & Related papers (2024-09-10T07:34:19Z)
- An Experimental Study on the Rashomon Effect of Balancing Methods in Imbalanced Classification [0.0]
This paper examines the impact of balancing methods on predictive multiplicity using the Rashomon effect.
This is crucial because, in data-centric AI, blindly selecting a model from a set of approximately equally accurate models is risky.
arXiv Detail & Related papers (2024-03-22T13:08:22Z)
- Multi-View Conformal Learning for Heterogeneous Sensor Fusion [0.12086712057375555]
We build and test multi-view and single-view conformal models for heterogeneous sensor fusion.
Our models provide theoretical marginal confidence guarantees since they are based on the conformal prediction framework.
Our results also show that multi-view models generate prediction sets with less uncertainty than single-view models.
arXiv Detail & Related papers (2024-02-19T17:30:09Z)
- EAMDrift: An interpretable self retrain model for time series [0.0]
We present EAMDrift, a novel method that combines forecasts from multiple individual predictors by weighting each prediction according to a performance metric.
EAMDrift is designed to automatically adapt to out-of-distribution patterns in data and identify the most appropriate models to use at each moment.
Our study on real-world datasets shows that EAMDrift outperforms individual baseline models by 20% and achieves comparable accuracy results to non-interpretable ensemble models.
arXiv Detail & Related papers (2023-05-31T13:25:26Z)
- Investigating Ensemble Methods for Model Robustness Improvement of Text Classifiers [66.36045164286854]
We analyze a set of existing bias features and demonstrate there is no single model that works best for all the cases.
By choosing an appropriate bias model, we can obtain a better robustness result than baselines with a more sophisticated model design.
arXiv Detail & Related papers (2022-10-28T17:52:10Z)
- Predictive Multiplicity in Probabilistic Classification [25.111463701666864]
We present a framework for measuring predictive multiplicity in probabilistic classification.
We demonstrate the incidence and prevalence of predictive multiplicity in real-world tasks.
Our results emphasize the need to report predictive multiplicity more widely.
arXiv Detail & Related papers (2022-06-02T16:25:29Z)
- Consistent Counterfactuals for Deep Models [25.1271020453651]
Counterfactual examples are used to explain predictions of machine learning models in key areas such as finance and medical diagnosis.
This paper studies the consistency of model prediction on counterfactual examples in deep networks under small changes to initial training conditions.
arXiv Detail & Related papers (2021-10-06T23:48:55Z)
- Back2Future: Leveraging Backfill Dynamics for Improving Real-time Predictions in Future [73.03458424369657]
In real-time forecasting in public health, data collection is a non-trivial and demanding task.
The 'backfill' phenomenon and its effect on model performance have barely been studied in the prior literature.
We formulate a novel problem and neural framework Back2Future that aims to refine a given model's predictions in real-time.
arXiv Detail & Related papers (2021-06-08T14:48:20Z)
- Learning Interpretable Deep State Space Model for Probabilistic Time Series Forecasting [98.57851612518758]
Probabilistic time series forecasting involves estimating the distribution of future values based on their history.
We propose a deep state space model for probabilistic time series forecasting whereby the non-linear emission model and transition model are parameterized by networks.
We show in experiments that our model produces accurate and sharp probabilistic forecasts.
arXiv Detail & Related papers (2021-01-31T06:49:33Z)
- Characterizing Fairness Over the Set of Good Models Under Selective Labels [69.64662540443162]
We develop a framework for characterizing predictive fairness properties over the set of models that deliver similar overall performance.
We provide tractable algorithms to compute the range of attainable group-level predictive disparities.
We extend our framework to address the empirically relevant challenge of selectively labelled data.
arXiv Detail & Related papers (2021-01-02T02:11:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.