Estimating and Explaining Model Performance When Both Covariates and
Labels Shift
- URL: http://arxiv.org/abs/2209.08436v1
- Date: Sun, 18 Sep 2022 01:16:16 GMT
- Title: Estimating and Explaining Model Performance When Both Covariates and
Labels Shift
- Authors: Lingjiao Chen and Matei Zaharia and James Zou
- Abstract summary: We propose a new distribution shift model, Sparse Joint Shift (SJS), which considers the joint shift of both labels and a few features.
We also propose SEES, an algorithmic framework to characterize the distribution shift under SJS and to estimate a model's performance on new data without any labels.
- Score: 36.94826820536239
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deployed machine learning (ML) models often encounter new user data that
differs from their training data. Therefore, estimating how well a given model
might perform on the new data is an important step toward reliable ML
applications. This is very challenging, however, as the data distribution can
change in flexible ways, and we may not have any labels on the new data, which
is often the case in monitoring settings. In this paper, we propose a new
distribution shift model, Sparse Joint Shift (SJS), which considers the joint
shift of both labels and a few features. This unifies and generalizes several
existing shift models including label shift and sparse covariate shift, where
only marginal feature or label distribution shifts are considered. We describe
mathematical conditions under which SJS is identifiable. We further propose
SEES, an algorithmic framework to characterize the distribution shift under SJS
and to estimate a model's performance on new data without any labels. We
conduct extensive experiments on several real-world datasets with various ML
models. Across different datasets and distribution shifts, SEES achieves
significant (up to an order of magnitude) shift estimation error improvements
over existing approaches.
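As a point of reference for the estimation problem, the sketch below handles the much simpler label-shift special case (only p(y) changes), which SJS generalizes. It follows the well-known confusion-matrix importance-weighting recipe and is not the SEES algorithm; the function name, the sklearn-style `predict` interface, and integer-coded labels are illustrative assumptions.

```python
# A minimal sketch of accuracy estimation under pure label shift (only p(y)
# changes), the special case that SJS generalizes. It uses the standard
# confusion-matrix importance-weighting recipe and is NOT the SEES algorithm.
# Assumes a sklearn-style classifier with .predict() and integer-coded labels.
import numpy as np

def estimate_target_accuracy(model, X_val, y_val, X_target, n_classes):
    """Estimate accuracy on unlabeled target data under a label-shift assumption."""
    y_val = np.asarray(y_val)
    preds_val = np.asarray(model.predict(X_val))      # labeled source validation set
    preds_tgt = np.asarray(model.predict(X_target))   # unlabeled target data

    # Joint confusion matrix on source data: C[i, j] = P_s(pred = i, y = j)
    C = np.zeros((n_classes, n_classes))
    for p, y in zip(preds_val, y_val):
        C[p, y] += 1.0 / len(y_val)

    # Predicted-label marginal on target data: mu[i] = P_t(pred = i)
    mu = np.bincount(preds_tgt, minlength=n_classes) / len(preds_tgt)

    # Under label shift, mu = C @ w with w[j] = p_t(y = j) / p_s(y = j)
    w, *_ = np.linalg.lstsq(C, mu, rcond=None)
    w = np.clip(w, 0.0, None)                          # keep the weights non-negative

    # Importance-weighted source accuracy approximates target accuracy
    correct = (preds_val == y_val).astype(float)
    return float(np.mean(w[y_val] * correct))
```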
Related papers
- Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks [75.42002070547267]
We propose a self-evolution learning (SE) based mixup approach for data augmentation in text classification.
We introduce a novel instance-specific label smoothing approach, which linearly interpolates the model's output and the one-hot labels of the original samples to generate new soft labels for mixup.
arXiv Detail & Related papers (2023-05-22T23:43:23Z)
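As a rough illustration of the soft-label construction described in this entry, the sketch below smooths each one-hot label toward the model's predicted distribution and then mixes pairs of examples as in standard mixup; `alpha`, the Beta(0.4, 0.4) mixing distribution, and the function names are illustrative assumptions, not the paper's settings.

```python
# A rough sketch of the soft-label construction described above: smooth each
# one-hot label toward the model's own predicted distribution, then mix pairs
# of examples as in standard mixup. `alpha` and the Beta(0.4, 0.4) mixing
# distribution are illustrative assumptions, not the paper's settings.
import torch
import torch.nn.functional as F

def instance_label_smoothing(logits, labels, n_classes, alpha=0.1):
    """Interpolate one-hot labels with the model's output distribution."""
    one_hot = F.one_hot(labels, n_classes).float()
    probs = F.softmax(logits.detach(), dim=-1)
    return (1.0 - alpha) * one_hot + alpha * probs     # instance-specific soft labels

def mixup(inputs, soft_labels):
    """Mix pairs of inputs (e.g., embeddings) and their soft labels."""
    lam = torch.distributions.Beta(0.4, 0.4).sample().item()
    idx = torch.randperm(inputs.size(0))
    mixed_inputs = lam * inputs + (1.0 - lam) * inputs[idx]
    mixed_labels = lam * soft_labels + (1.0 - lam) * soft_labels[idx]
    return mixed_inputs, mixed_labels
```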
- Explanation Shift: How Did the Distribution Shift Impact the Model? [23.403838118256907]
We study how explanation characteristics shift when affected by distribution shifts.
We analyze different types of distribution shifts using synthetic examples and real-world data sets.
We release our methods in an open-source Python package, as well as the code used to reproduce our experiments.
arXiv Detail & Related papers (2023-03-14T17:13:01Z)
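One simple way to make the idea of comparing explanations across distributions concrete is sketched below: compute per-feature attributions on source and target data and train a detector to separate them. The linear attribution (coefficient times feature value), the assumption of a fitted binary sklearn LogisticRegression, and the domain-classifier test are illustrative choices, not the paper's exact method or its released package.

```python
# A rough sketch of comparing explanation distributions across source and
# target data: compute per-feature attributions and train a detector to tell
# them apart (AUC near 0.5 means no detectable explanation shift). The linear
# attribution (coefficient * feature value) and the domain-classifier test are
# illustrative choices, not the paper's exact method or package API.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def explanation_shift_score(model, X_source, X_target):
    """AUC of a classifier separating source vs. target attributions."""
    attr_src = X_source * model.coef_[0]    # attributions on source data
    attr_tgt = X_target * model.coef_[0]    # attributions on target data
    A = np.vstack([attr_src, attr_tgt])
    d = np.concatenate([np.zeros(len(attr_src)), np.ones(len(attr_tgt))])
    detector = LogisticRegression(max_iter=1000)
    return cross_val_score(detector, A, d, cv=5, scoring="roc_auc").mean()
```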
- Dataset Interfaces: Diagnosing Model Failures Using Controllable Counterfactual Generation [85.13934713535527]
Distribution shift is a major source of failure for machine learning models.
We introduce the notion of a dataset interface: a framework that, given an input dataset and a user-specified shift, returns instances that exhibit the desired shift.
We demonstrate how applying this dataset interface to the ImageNet dataset enables studying model behavior across a diverse array of distribution shifts.
arXiv Detail & Related papers (2023-02-15T18:56:26Z)
- Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data alone by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performance comparable to that of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z)
- Combat Data Shift in Few-shot Learning with Knowledge Graph [42.59886121530736]
In real-world applications, the few-shot learning paradigm often suffers from data shift.
Most existing few-shot learning approaches are not designed with data shift in mind.
We propose a novel metric-based meta-learning framework to extract task-specific representations and task-shared representations.
arXiv Detail & Related papers (2021-01-27T12:35:18Z)
- WILDS: A Benchmark of in-the-Wild Distribution Shifts [157.53410583509924]
Distribution shifts can substantially degrade the accuracy of machine learning systems deployed in the wild.
We present WILDS, a curated collection of 8 benchmark datasets that reflect a diverse range of distribution shifts.
We show that standard training results in substantially lower out-of-distribution performance than in-distribution performance.
arXiv Detail & Related papers (2020-12-14T11:14:56Z)
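For context, loading a WILDS dataset and its out-of-distribution test split typically looks like the sketch below, based on the wilds Python package's documented interface; the choice of dataset, transform, and batch size are placeholders, and the exact API may vary across package versions.

```python
# A minimal sketch of loading a WILDS dataset and its out-of-distribution test
# split, based on the wilds package's documented interface; the dataset name,
# transform, and batch size are placeholders, and the exact API may vary
# across package versions.
from wilds import get_dataset
from wilds.common.data_loaders import get_train_loader, get_eval_loader
from torchvision import transforms

dataset = get_dataset(dataset="camelyon17", download=True)
transform = transforms.ToTensor()

train_data = dataset.get_subset("train", transform=transform)
test_data = dataset.get_subset("test", transform=transform)   # shifted (OOD) hospitals

train_loader = get_train_loader("standard", train_data, batch_size=32)
test_loader = get_eval_loader("standard", test_data, batch_size=32)
```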
- Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift [81.74795324629712]
We evaluate prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift.
We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness.
The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
arXiv Detail & Related papers (2020-06-19T05:08:43Z)
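A simplified reading of prediction-time batch normalization can be sketched in PyTorch as below: the batch-norm layers of a copied model are switched to use the statistics of the incoming test batch rather than the training-time running averages. This illustrates the general idea, not the paper's exact procedure; the function name is an assumption.

```python
# A simplified sketch of prediction-time batch normalization in PyTorch:
# normalize with the statistics of the incoming test batch rather than the
# training-time running averages. Working on a deep copy avoids mutating the
# original model's running statistics. This illustrates the general idea, not
# the paper's exact procedure.
import copy
import torch
import torch.nn as nn

def predict_with_batch_stats(model, x):
    model_bn = copy.deepcopy(model)
    model_bn.eval()
    # Switch only the BatchNorm layers back to training mode so the forward
    # pass uses the current batch's mean and variance.
    for m in model_bn.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.train()
    with torch.no_grad():
        return model_bn(x)
```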
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.