Distributionally Robust Post-hoc Classifiers under Prior Shifts
- URL: http://arxiv.org/abs/2309.08825v1
- Date: Sat, 16 Sep 2023 00:54:57 GMT
- Title: Distributionally Robust Post-hoc Classifiers under Prior Shifts
- Authors: Jiaheng Wei, Harikrishna Narasimhan, Ehsan Amid, Wen-Sheng Chu, Yang Liu, and Abhishek Kumar
- Abstract summary: We investigate the problem of training models that are robust to shifts caused by changes in the distribution of class-priors or group-priors.
We present an extremely lightweight post-hoc approach that performs scaling adjustments to predictions from a pre-trained model.
- Score: 31.237674771958165
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The generalization ability of machine learning models degrades significantly
when the test distribution shifts away from the training distribution. We
investigate the problem of training models that are robust to shifts caused by
changes in the distribution of class-priors or group-priors. The presence of
skewed training priors can often lead to the models overfitting to spurious
features. Unlike existing methods, which optimize for either the worst or the
average performance over classes or groups, our work is motivated by the need
for finer control over the robustness properties of the model. We present an
extremely lightweight post-hoc approach that performs scaling adjustments to
predictions from a pre-trained model, with the goal of minimizing a
distributionally robust loss around a chosen target distribution. These
adjustments are computed by solving a constrained optimization problem on a
validation set and applied to the model during test time. Our constrained
optimization objective is inspired by a natural notion of robustness to
controlled distribution shifts. Our method comes with provable guarantees and
empirically makes a strong case for distributionally robust post-hoc classifiers.
An empirical implementation is available at
https://github.com/weijiaheng/Drops.
Related papers
- Towards Stable Machine Learning Model Retraining via Slowly Varying Sequences [6.067007470552307]
We propose a methodology for finding sequences of machine learning models that are stable across retraining iterations.
We develop a mixed-integer optimization formulation that is guaranteed to recover optimal models.
Our method shows stronger stability than greedily trained models with a small, controllable sacrifice in predictive power.
arXiv Detail & Related papers (2024-03-28T22:45:38Z)
- Mitigating the Bias in the Model for Continual Test-Time Adaptation [32.33057968481597]
Continual Test-Time Adaptation (CTA) is a challenging task that aims to adapt a source pre-trained model to continually changing target domains.
We find that a model shows highly biased predictions as it constantly adapts to the changing distribution of the target data.
This paper mitigates this issue to improve performance in the CTA scenario.
arXiv Detail & Related papers (2024-03-02T23:37:16Z)
- Ask Your Distribution Shift if Pre-Training is Right for You [74.18516460467019]
In practice, fine-tuning a pre-trained model improves robustness significantly in some cases but not at all in others.
We focus on two possible failure modes of models under distribution shift: poor extrapolation and biases in the training data.
Our study suggests that, as a rule of thumb, pre-training can help mitigate poor extrapolation but not dataset biases.
arXiv Detail & Related papers (2024-02-29T23:46:28Z)
- RanPAC: Random Projections and Pre-trained Models for Continual Learning [59.07316955610658]
Continual learning (CL) aims to learn different tasks (such as classification) in a non-stationary data stream without forgetting old ones.
We propose a concise and effective approach for CL with pre-trained models.
arXiv Detail & Related papers (2023-07-05T12:49:02Z)
- Robustness, Evaluation and Adaptation of Machine Learning Models in the Wild [4.304803366354879]
We study causes of impaired robustness to domain shifts and present algorithms for training domain robust models.
A key source of model brittleness is domain overfitting, which our new training algorithms suppress while encouraging domain-general hypotheses.
arXiv Detail & Related papers (2023-03-05T21:41:16Z)
- CLIPood: Generalizing CLIP to Out-of-Distributions [73.86353105017076]
Contrastive language-image pre-training (CLIP) models have shown impressive zero-shot ability, but further adaptation of CLIP on downstream tasks undesirably degrades OOD performance.
We propose CLIPood, a fine-tuning method that can adapt CLIP models to OOD situations where both domain shifts and open classes may occur on unseen test data.
Experiments on diverse datasets with different OOD scenarios show that CLIPood consistently outperforms existing generalization techniques.
arXiv Detail & Related papers (2023-02-02T04:27:54Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- Predicting with Confidence on Unseen Distributions [90.68414180153897]
We connect domain adaptation and predictive uncertainty literature to predict model accuracy on challenging unseen distributions.
We find that the difference of confidences (DoC) of a classifier's predictions successfully estimates the classifier's performance change over a variety of shifts.
We specifically investigate the distinction between synthetic and natural distribution shifts and observe that, despite its simplicity, DoC consistently outperforms other quantifications of distributional difference (see the short sketch after this list).
arXiv Detail & Related papers (2021-07-07T15:50:18Z)
- Deep Ensembles for Low-Data Transfer Learning [21.578470914935938]
We study different ways of creating ensembles from pre-trained models.
We show that the nature of pre-training itself is an effective source of diversity.
We propose a practical algorithm that efficiently identifies a subset of pre-trained models for any downstream dataset.
arXiv Detail & Related papers (2020-10-14T07:59:00Z)
- Estimating Generalization under Distribution Shifts via Domain-Invariant Representations [75.74928159249225]
We use a set of domain-invariant predictors as a proxy for the unknown, true target labels.
The error of the resulting risk estimate depends on the target risk of the proxy model.
arXiv Detail & Related papers (2020-07-06T17:21:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.