Improved Bayes Risk Can Yield Reduced Social Welfare Under Competition
- URL: http://arxiv.org/abs/2306.14670v3
- Date: Tue, 6 Feb 2024 08:42:12 GMT
- Title: Improved Bayes Risk Can Yield Reduced Social Welfare Under Competition
- Authors: Meena Jagadeesan, Michael I. Jordan, Jacob Steinhardt, Nika Haghtalab
- Abstract summary: In this work, we demonstrate that competition can fundamentally alter the behavior of machine learning scaling trends.
We find many settings where improving data representation quality decreases the overall predictive accuracy across users.
At a conceptual level, our work suggests that favorable scaling trends for individual model-providers need not translate to downstream improvements in social welfare.
- Score: 99.7047087527422
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As the scale of machine learning models increases, trends such as scaling
laws anticipate consistent downstream improvements in predictive accuracy.
However, these trends take the perspective of a single model-provider in
isolation, while in reality providers often compete with each other for users.
In this work, we demonstrate that competition can fundamentally alter the
behavior of these scaling trends, even causing overall predictive accuracy
across users to be non-monotonic or decreasing with scale. We define a model of
competition for classification tasks, and use data representations as a lens
for studying the impact of increases in scale. We find many settings where
improving data representation quality (as measured by Bayes risk) decreases the
overall predictive accuracy across users (i.e., social welfare) for a
marketplace of competing model-providers. Our examples range from closed-form
formulas in simple settings to simulations with pretrained representations on
CIFAR-10. At a conceptual level, our work suggests that favorable scaling
trends for individual model-providers need not translate to downstream
improvements in social welfare in marketplaces with multiple model providers.
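To make the setup concrete, here is a minimal sketch (in Python, not the authors' code) of the kind of marketplace harness the paper studies: providers fit classifiers on a shared representation whose quality is controlled by a noise level, users join whichever provider is more accurate on their own subpopulation, and social welfare is the resulting overall accuracy. All modeling choices here (Gaussian data, threshold classifiers, two providers) are illustrative assumptions; reproducing the paper's non-monotone welfare trends requires its formal competition model.

```python
# Toy marketplace: two providers fit thresholds on a shared 1-D representation;
# each user joins whichever provider is more accurate on that user's subpopulation.
# "Representation quality" is controlled by sigma (lower noise = better Bayes risk).
# All modeling choices are illustrative, not the paper's exact setup.
import numpy as np

rng = np.random.default_rng(0)

def make_data(sigma, n=20_000):
    # Two user subpopulations with different class-conditional signal strengths.
    group = rng.integers(0, 2, size=n)
    y = rng.integers(0, 2, size=n)
    mu = np.where(group == 0, 1.0, 2.0)
    x = np.where(y == 1, mu, -mu) + sigma * rng.normal(size=n)
    return x, y, group

def accuracy(threshold, x, y):
    return np.mean((x > threshold).astype(int) == y)

for sigma in [2.0, 1.0, 0.5]:  # improving representation quality
    x, y, g = make_data(sigma)
    # Each provider best-responds to one subpopulation (a crude stand-in
    # for competition driving specialization).
    thresholds = []
    for k in (0, 1):
        cand = np.linspace(-3, 3, 201)
        accs = [accuracy(t, x[g == k], y[g == k]) for t in cand]
        thresholds.append(cand[int(np.argmax(accs))])
    # Users pick the provider that is more accurate for their own group;
    # welfare is the average accuracy users actually experience.
    welfare = np.mean([
        max(accuracy(t, x[g == k], y[g == k]) for t in thresholds)
        for k in (0, 1)
    ])
    print(f"sigma={sigma}: social welfare ~ {welfare:.3f}")
```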
Related papers
- Editable Fairness: Fine-Grained Bias Mitigation in Language Models [52.66450426729818]
We propose a novel debiasing approach, Fairness Stamp (FAST), which enables fine-grained calibration of individual social biases.
FAST surpasses state-of-the-art baselines in debiasing performance.
This highlights the potential of fine-grained debiasing strategies to achieve fairness in large language models.
arXiv Detail & Related papers (2024-08-07T17:14:58Z)
- Fair Multivariate Adaptive Regression Splines for Ensuring Equity and Transparency [1.124958340749622]
We propose a fair predictive model based on MARS that incorporates fairness measures in the learning process.
MARS is a non-parametric regression model that performs feature selection, handles non-linear relationships, generates interpretable decision rules, and derives optimal splitting criteria on the variables.
We apply our fairMARS model to real-world data and demonstrate its effectiveness in terms of accuracy and equity.
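As an illustration of the basis MARS builds on (not the fairMARS procedure itself), the following sketch fits a piecewise-linear model from hinge functions max(0, x - t) at hypothetical knot locations:

```python
# Minimal illustration of the hinge basis underlying MARS: a piecewise-linear
# fit built from max(0, x - t) and max(0, t - x) terms at fixed knots.
# This shows only the basis construction, not the fairMARS fitting procedure.
import numpy as np

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(-3, 3, 200))
y = np.sin(x) + 0.1 * rng.normal(size=200)  # toy nonlinear target

knots = [-1.5, 0.0, 1.5]  # hypothetical knot locations
# Design matrix: intercept plus a pair of hinge features per knot.
B = np.column_stack(
    [np.ones_like(x)]
    + [np.maximum(0, x - t) for t in knots]
    + [np.maximum(0, t - x) for t in knots]
)
coef, *_ = np.linalg.lstsq(B, y, rcond=None)  # least-squares fit of the basis
print("training MSE:", np.mean((B @ coef - y) ** 2))
```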
arXiv Detail & Related papers (2024-02-23T19:02:24Z)
- Predictive Churn with the Set of Good Models [64.05949860750235]
We study the effect of conflicting predictions over the set of near-optimal machine learning models.
We present theoretical results on the expected churn between models within the Rashomon set.
We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
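Churn itself is easy to compute: it is the fraction of test inputs on which two models disagree, even when their accuracies are nearly identical. A minimal sketch on synthetic data (the paper analyzes this quantity over the whole Rashomon set):

```python
# Churn between two near-optimal models: the fraction of examples on which
# their predictions disagree. Hypothetical setup with two models from the
# same family trained with different regularization.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

m1 = LogisticRegression(C=1.0, max_iter=1000).fit(X_tr, y_tr)
m2 = LogisticRegression(C=0.1, max_iter=1000).fit(X_tr, y_tr)

p1, p2 = m1.predict(X_te), m2.predict(X_te)
churn = np.mean(p1 != p2)  # similar accuracy does not imply similar predictions
print(f"acc1={m1.score(X_te, y_te):.3f} acc2={m2.score(X_te, y_te):.3f} churn={churn:.3f}")
```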
arXiv Detail & Related papers (2024-02-12T16:15:25Z)
- Federated Skewed Label Learning with Logits Fusion [23.062650578266837]
Federated learning (FL) aims to collaboratively train a shared model across multiple clients without transmitting their local data.
We propose FedBalance, which corrects the optimization bias among local models by calibrating their logits.
Our method achieves 13% higher average accuracy than state-of-the-art methods.
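The paper details FedBalance's exact calibration; as a hedged illustration of the general idea, the standard logit-adjustment remedy for skewed labels subtracts a client's log class prior from its logits:

```python
# Generic logit adjustment for skewed labels: subtract the log of a client's
# local class prior from its logits so majority classes do not dominate.
# Illustrative only; FedBalance's calibration differs in detail.
import numpy as np

def adjust_logits(logits, class_counts, tau=1.0):
    # logits: (n_examples, n_classes); class_counts: local label histogram.
    prior = class_counts / class_counts.sum()
    return logits - tau * np.log(prior + 1e-12)

logits = np.array([[2.0, 0.5], [1.2, 1.0]])
counts = np.array([900, 100])  # heavily skewed local labels
print(adjust_logits(logits, counts))
```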
arXiv Detail & Related papers (2023-11-14T14:37:33Z)
- Scaling Laws Do Not Scale [54.72120385955072]
Recent work has argued that as the size of a dataset increases, the performance of a model trained on that dataset will increase.
We argue that this scaling law relationship depends on metrics used to measure performance that may not correspond with how different groups of people perceive the quality of models' output.
Different communities may also have values in tension with each other, leading to difficult, potentially irreconcilable choices about metrics used for model evaluations.
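The underlying technical point is that a single aggregate metric can mask divergent group-level experiences, as in this toy disaggregation (group labels are hypothetical):

```python
# Aggregate accuracy can hide large per-group differences: the same overall
# number is consistent with very different group-level experiences.
import numpy as np

y_true = np.array([1, 1, 0, 0, 1, 0, 1, 0])
y_pred = np.array([1, 1, 0, 0, 0, 1, 0, 1])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # hypothetical group labels

print("overall:", np.mean(y_true == y_pred))  # 0.5
for g in (0, 1):
    m = group == g
    print(f"group {g}:", np.mean(y_true[m] == y_pred[m]))  # 1.0 vs 0.0
```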
arXiv Detail & Related papers (2023-07-05T15:32:21Z)
- Maintaining Stability and Plasticity for Predictive Churn Reduction [8.971668467496055]
We propose a solution called Accumulated Model Combination (AMC).
AMC is a general technique, and we propose several instances of it, each with its own advantages depending on the model and data properties.
arXiv Detail & Related papers (2023-05-06T20:56:20Z)
- Learning Sample Difficulty from Pre-trained Models for Reliable Prediction [55.77136037458667]
We propose to utilize large-scale pre-trained models to guide downstream model training with sample difficulty-aware entropy regularization.
We simultaneously improve accuracy and uncertainty calibration across challenging benchmarks.
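A sketch of what difficulty-aware entropy regularization could look like, using a pretrained model's per-sample loss as the difficulty score; the weighting scheme here is an assumption, not the paper's exact regularizer:

```python
# Sketch of difficulty-aware entropy regularization: use a large pretrained
# model's per-sample loss as a difficulty score, and penalize confident
# (low-entropy) predictions more on hard samples. Hypothetical weighting.
import torch
import torch.nn.functional as F

def difficulty_aware_loss(logits, targets, pretrained_logits, lam=0.1):
    ce = F.cross_entropy(logits, targets, reduction="none")
    # Difficulty: pretrained model's per-sample loss, squashed to [0, 1).
    difficulty = torch.tanh(F.cross_entropy(pretrained_logits, targets, reduction="none"))
    p = F.softmax(logits, dim=-1)
    entropy = -(p * torch.log(p + 1e-12)).sum(dim=-1)
    # Encourage higher predictive entropy (less overconfidence) on hard samples.
    return (ce - lam * difficulty * entropy).mean()

logits = torch.randn(4, 10, requires_grad=True)
pretrained = torch.randn(4, 10)
targets = torch.randint(0, 10, (4,))
loss = difficulty_aware_loss(logits, targets, pretrained)
loss.backward()
print(float(loss))
```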
arXiv Detail & Related papers (2023-04-20T07:29:23Z)
- Non-Invasive Fairness in Learning through the Lens of Data Drift [88.37640805363317]
We show how to improve the fairness of Machine Learning models without altering the data or the learning algorithm.
We use a simple but key insight: the divergence of trends between different populations, and, consequently, between a learned model and minority populations, is analogous to data drift.
We explore two strategies (model-splitting and reweighing) to resolve this drift, aiming to improve the overall conformance of models to the underlying data.
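Reweighing here can be read in the classic Kamiran-Calders sense: weight each example by P(group) * P(label) / P(group, label) so that group and label become independent in the reweighted data. A minimal sketch (the paper's drift-based variant differs in detail):

```python
# Kamiran-Calders-style reweighing: weight each example by
# P(group) * P(label) / P(group, label), making group and label
# statistically independent in the reweighted data. Illustrative only.
import numpy as np

def reweigh(groups, labels):
    weights = np.empty(len(labels))
    for g in np.unique(groups):
        for y in np.unique(labels):
            mask = (groups == g) & (labels == y)
            if mask.any():
                weights[mask] = (np.mean(groups == g) * np.mean(labels == y)) / mask.mean()
    return weights

g = np.array([0, 0, 0, 1, 1, 1, 1, 1])
y = np.array([1, 1, 0, 0, 0, 0, 0, 1])
print(reweigh(g, y))  # underrepresented (group, label) cells get weight > 1
```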
arXiv Detail & Related papers (2023-03-30T17:30:42Z)
- In the Eye of the Beholder: Robust Prediction with Causal User Modeling [27.294341513692164]
We propose a learning framework for relevance prediction that is robust to changes in the data distribution.
Our key observation is that robustness can be obtained by accounting for how users causally perceive the environment.
arXiv Detail & Related papers (2022-06-01T11:33:57Z)
- Newer is not always better: Rethinking transferability metrics, their peculiarities, stability and performance [5.650647159993238]
Fine-tuning of large pre-trained image and language models on small customized datasets has become increasingly popular.
We show that the statistical problems with covariance estimation drive the poor performance of H-score.
We propose a correction and recommend measuring correlation performance against relative accuracy in such settings.
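The H-score is H(f) = tr(cov(f)^{-1} cov(E[f|y])), and the correction replaces the sample covariance with a shrinkage estimate. A sketch using scikit-learn's Ledoit-Wolf estimator (the paper's exact estimator may differ):

```python
# H-score transferability metric with a shrinkage covariance estimate,
# the style of correction the paper recommends: H(f) = tr(cov(f)^-1 cov(E[f|y]))
# with cov(f) estimated by Ledoit-Wolf shrinkage instead of the sample covariance.
import numpy as np
from sklearn.covariance import LedoitWolf

def shrinkage_h_score(features, labels):
    # features: (n, d) embeddings from a pretrained model; labels: (n,)
    cov_f = LedoitWolf().fit(features).covariance_
    classes = np.unique(labels)
    class_means = np.stack([features[labels == c].mean(axis=0) - features.mean(axis=0)
                            for c in classes])
    weights = np.array([(labels == c).mean() for c in classes])
    # Between-class covariance cov(E[f|y]), weighted by class frequency.
    cov_between = (class_means.T * weights) @ class_means
    return np.trace(np.linalg.solve(cov_f, cov_between))

rng = np.random.default_rng(0)
feats = rng.normal(size=(300, 16))
labs = rng.integers(0, 3, size=300)
print(shrinkage_h_score(feats, labs))
```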
arXiv Detail & Related papers (2021-10-13T17:24:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.