Calibrated ensembles can mitigate accuracy tradeoffs under distribution shift
- URL: http://arxiv.org/abs/2207.08977v1
- Date: Mon, 18 Jul 2022 23:14:44 GMT
- Title: Calibrated ensembles can mitigate accuracy tradeoffs under distribution shift
- Authors: Ananya Kumar and Tengyu Ma and Percy Liang and Aditi Raghunathan
- Abstract summary: We find that ID-calibrated ensembles outperform the prior state-of-the-art (based on self-training) on both ID and OOD accuracy.
We analyze this method in stylized settings, and identify two important conditions for ensembles to perform well both ID and OOD.
- Score: 108.30303219703845
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We often see undesirable tradeoffs in robust machine learning where
out-of-distribution (OOD) accuracy is at odds with in-distribution (ID)
accuracy: a robust classifier obtained via specialized techniques such as
removing spurious features often has better OOD but worse ID accuracy compared
to a standard classifier trained via ERM. In this paper, we find that
ID-calibrated ensembles -- where we simply ensemble the standard and robust
models after calibrating on only ID data -- outperform the prior state-of-the-art
(based on self-training) on both ID and OOD accuracy. On eleven natural
distribution shift datasets, ID-calibrated ensembles obtain the best of both
worlds: strong ID accuracy and OOD accuracy. We analyze this method in stylized
settings, and identify two important conditions for ensembles to perform well
both ID and OOD: (1) we need to calibrate the standard and robust models (on ID
data, because OOD data is unavailable), (2) OOD has no anticorrelated spurious
features.
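The recipe described in the abstract — calibrate each model on held-out ID data, then average their calibrated probabilities — can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes temperature scaling as the calibration method (a common choice, fit by minimizing negative log-likelihood on ID validation logits) and a simple equal-weight average of the two models' probabilities; the function names are hypothetical.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; T > 1 softens, T < 1 sharpens.
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_temperature(val_logits, val_labels, grid=np.linspace(0.25, 4.0, 64)):
    # Pick the temperature minimizing NLL on held-out ID validation data.
    # Note: calibration uses ID data only, since OOD data is unavailable.
    best_T, best_nll = 1.0, np.inf
    for T in grid:
        p = softmax(val_logits, T)
        nll = -np.log(p[np.arange(len(val_labels)), val_labels] + 1e-12).mean()
        if nll < best_nll:
            best_T, best_nll = T, nll
    return best_T

def calibrated_ensemble(std_logits, robust_logits, T_std, T_robust):
    # Average the ID-calibrated probabilities of the standard and robust models.
    return 0.5 * (softmax(std_logits, T_std) + softmax(robust_logits, T_robust))
```

At test time, predictions are taken as the argmax of the averaged calibrated probabilities; the calibration step matters because an uncalibrated but overconfident model would otherwise dominate the average.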
Related papers
- How Does Unlabeled Data Provably Help Out-of-Distribution Detection? [63.41681272937562]
Exploiting unlabeled in-the-wild data is non-trivial due to the heterogeneity of both in-distribution (ID) and out-of-distribution (OOD) data.
This paper introduces a new learning framework SAL (Separate And Learn) that offers both strong theoretical guarantees and empirical effectiveness.
arXiv Detail & Related papers (2024-02-05T20:36:33Z) - Out-of-Distribution Knowledge Distillation via Confidence Amendment [50.56321442948141]
Out-of-distribution (OOD) detection is essential in identifying test samples that deviate from the in-distribution (ID) data upon which a standard network is trained.
This paper introduces OOD knowledge distillation, a pioneering learning framework applicable whether or not training ID data is available.
This framework harnesses OOD-sensitive knowledge from the standard network to craft a binary classifier adept at distinguishing between ID and OOD samples.
arXiv Detail & Related papers (2023-11-14T08:05:02Z) - Out-of-distribution Detection Learning with Unreliable Out-of-distribution Sources [73.28967478098107]
Out-of-distribution (OOD) detection discerns OOD data, on which the predictor cannot make valid predictions, from in-distribution (ID) data.
It is typically hard to collect real out-of-distribution (OOD) data for training a predictor capable of discerning OOD patterns.
We propose a data generation-based learning method named Auxiliary Task-based OOD Learning (ATOL) that can relieve the mistaken OOD generation.
arXiv Detail & Related papers (2023-11-06T16:26:52Z) - Towards Calibrated Robust Fine-Tuning of Vision-Language Models [97.19901765814431]
This work proposes a robust fine-tuning method that improves both OOD accuracy and calibration error in Vision Language Models (VLMs).
Based on this insight, we design a novel framework that conducts fine-tuning with a constrained multimodal contrastive loss enforcing a larger smallest singular value.
arXiv Detail & Related papers (2023-11-03T05:41:25Z) - OODRobustBench: a Benchmark and Large-Scale Analysis of Adversarial Robustness under Distribution Shift [20.14559162084261]
OODRobustBench is used to assess 706 robust models using 60.7K adversarial evaluations.
This large-scale analysis shows that adversarial robustness suffers from a severe OOD generalization issue.
We then predict and verify that existing methods are unlikely to achieve high OOD robustness.
arXiv Detail & Related papers (2023-10-19T14:50:46Z) - Agreement-on-the-Line: Predicting the Performance of Neural Networks under Distribution Shift [18.760716606922482]
We show a similar but surprising phenomenon also holds for the agreement between pairs of neural network classifiers.
Our prediction algorithm outperforms previous methods both in shifts where agreement-on-the-line holds and, surprisingly, when accuracy is not on the line.
arXiv Detail & Related papers (2022-06-27T07:50:47Z) - Training OOD Detectors in their Natural Habitats [31.565635192716712]
Out-of-distribution (OOD) detection is important for machine learning models deployed in the wild.
Recent methods use auxiliary outlier data to regularize the model for improved OOD detection.
We propose a novel framework that leverages wild mixture data -- that naturally consists of both ID and OOD samples.
arXiv Detail & Related papers (2022-02-07T15:38:39Z) - Provably Robust Detection of Out-of-distribution Data (almost) for free [124.14121487542613]
Deep neural networks are known to produce highly overconfident predictions on out-of-distribution (OOD) data.
In this paper we propose a novel method where from first principles we combine a certifiable OOD detector with a standard classifier into an OOD aware classifier.
In this way we achieve the best of two worlds: certifiably adversarially robust OOD detection, even for OOD samples close to the in-distribution, without loss in prediction accuracy, and close-to-state-of-the-art OOD detection performance for non-manipulated OOD data.
arXiv Detail & Related papers (2021-06-08T11:40:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.