The Evolution of Out-of-Distribution Robustness Throughout Fine-Tuning
- URL: http://arxiv.org/abs/2106.15831v1
- Date: Wed, 30 Jun 2021 06:21:42 GMT
- Title: The Evolution of Out-of-Distribution Robustness Throughout Fine-Tuning
- Authors: Anders Andreassen, Yasaman Bahri, Behnam Neyshabur, Rebecca Roelofs
- Abstract summary: Models that are more accurate on out-of-distribution data than the linear in- vs. out-of-distribution accuracy trend would predict exhibit "effective robustness".
We find that models pre-trained on larger datasets exhibit effective robustness during training that vanishes at convergence.
We discuss several strategies for scaling effective robustness to the high-accuracy regime to improve the out-of-distribution accuracy of state-of-the-art models.
- Score: 25.85044477227461
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although machine learning models typically experience a drop in performance
on out-of-distribution data, accuracies on in- versus out-of-distribution data
are widely observed to follow a single linear trend when evaluated across a
testbed of models. Models that are more accurate on the out-of-distribution
data relative to this baseline exhibit "effective robustness" and are
exceedingly rare. Identifying such models, and understanding their properties,
is key to improving out-of-distribution performance. We conduct a thorough
empirical investigation of effective robustness during fine-tuning and
surprisingly find that models pre-trained on larger datasets exhibit effective
robustness during training that vanishes at convergence. We study how
properties of the data influence effective robustness, and we show that it
increases with larger dataset size, greater diversity, and higher example
difficulty. We also find that models that display effective robustness are
able to correctly classify 10% of the examples that no other current testbed
model gets correct. Finally, we discuss several strategies for scaling
effective robustness to the high-accuracy regime to improve the
out-of-distribution accuracy of state-of-the-art models.
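To make the notion concrete, below is a minimal sketch (not the paper's exact protocol) of how effective robustness is typically measured: fit the linear in- vs. out-of-distribution accuracy trend across a testbed of reference models, commonly in logit-transformed accuracy space, and report how far a given model sits above that trend. The testbed values and accuracies are illustrative placeholders.

```python
import numpy as np

def logit(p, eps=1e-6):
    # ID-vs-OOD accuracy trends are roughly linear after a logit transform.
    p = np.clip(p, eps, 1 - eps)
    return np.log(p / (1 - p))

def effective_robustness(id_acc, ood_acc, testbed_id, testbed_ood):
    """OOD accuracy of one model minus the OOD accuracy predicted for its ID
    accuracy by the linear trend fit on a testbed of reference models."""
    slope, intercept = np.polyfit(logit(np.asarray(testbed_id)),
                                  logit(np.asarray(testbed_ood)), deg=1)
    # Map the trend's prediction back from logit space to an accuracy.
    predicted_ood = 1.0 / (1.0 + np.exp(-(slope * logit(id_acc) + intercept)))
    return ood_acc - predicted_ood

# Illustrative testbed: a model sitting above the trend has positive effective robustness.
testbed_id = [0.60, 0.70, 0.76, 0.82, 0.88]
testbed_ood = [0.40, 0.50, 0.57, 0.64, 0.72]
print(effective_robustness(id_acc=0.80, ood_acc=0.70,
                           testbed_id=testbed_id, testbed_ood=testbed_ood))
```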
Related papers
- Clarifying Myths About the Relationship Between Shape Bias, Accuracy, and Robustness [18.55761892159021]
Deep learning models can perform well when evaluated on images from the same distribution as the training set.
Applying small amounts of blur to a model's input images, or feeding the model out-of-distribution (OOD) data, can significantly reduce its accuracy.
Data augmentation is one of the most widely used methods for improving model robustness to OOD data.
arXiv Detail & Related papers (2024-06-07T15:21:00Z)
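The entry above pairs blur perturbations with augmentation-based robustness. As a rough illustration only (the kernel sizes, strengths, and pipeline are assumptions, not the paper's setup), a torchvision-style sketch that uses a small Gaussian blur both as an evaluation-time corruption and as a training-time augmentation:

```python
from torchvision import transforms

# Evaluation-time corruption: a mild, fixed Gaussian blur applied to every image
# to probe how much accuracy drops under this simple OOD-style shift.
blur_eval = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.GaussianBlur(kernel_size=5, sigma=1.0),  # placeholder strength
    transforms.ToTensor(),
])

# Training-time augmentation: apply the same kind of corruption randomly, the
# usual recipe for making a model more robust to it at test time.
blur_train = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomApply([transforms.GaussianBlur(kernel_size=5, sigma=(0.1, 2.0))], p=0.5),
    transforms.ToTensor(),
])
```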
- Bigger is not Always Better: Scaling Properties of Latent Diffusion Models [46.52780730073693]
We study the scaling properties of latent diffusion models (LDMs) with an emphasis on their sampling efficiency.
We conduct an in-depth investigation into how model size influences sampling efficiency across varying sampling steps.
Our findings unveil a surprising trend: when operating under a given inference budget, smaller models frequently outperform their larger equivalents in generating high-quality results.
arXiv Detail & Related papers (2024-04-01T17:59:48Z)
- DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception [78.26734070960886]
Current perceptive models heavily depend on resource-intensive datasets.
We introduce perception-aware loss (P.A. loss) through segmentation, improving both quality and controllability.
Our method customizes data augmentation by extracting and utilizing perception-aware attribute (P.A. Attr) during generation.
arXiv Detail & Related papers (2024-03-20T04:58:03Z)
- Orthogonal Uncertainty Representation of Data Manifold for Robust Long-Tailed Learning [52.021899899683675]
In scenarios with long-tailed distributions, the model's ability to identify tail classes is limited due to the under-representation of tail samples.
We propose an Orthogonal Uncertainty Representation (OUR) of feature embeddings and an end-to-end training strategy to improve model robustness under long-tailed distributions.
arXiv Detail & Related papers (2023-10-16T05:50:34Z)
- Investigating Ensemble Methods for Model Robustness Improvement of Text Classifiers [66.36045164286854]
We analyze a set of existing bias features and demonstrate that there is no single model that works best for all cases.
By choosing an appropriate bias model, we can obtain better robustness than baselines with more sophisticated model designs.
arXiv Detail & Related papers (2022-10-28T17:52:10Z)
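A common instance of the bias-model ensembling mentioned above is product-of-experts debiasing, in which a frozen bias model's predictions are combined with the main model's during training so the main model earns little credit on examples the bias model already solves through shortcuts. The sketch below shows that generic recipe and is not necessarily the exact ensemble studied in the paper:

```python
import torch.nn.functional as F

def poe_debiased_loss(main_logits, bias_logits, labels):
    """Product-of-experts debiasing loss: combine the main model's and a frozen
    bias model's log-probabilities, then apply cross-entropy so gradients flow
    only into the main model."""
    combined = F.log_softmax(main_logits, dim=-1) + F.log_softmax(bias_logits.detach(), dim=-1)
    # cross_entropy renormalizes `combined`, yielding the product-of-experts distribution.
    return F.cross_entropy(combined, labels)
```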
- Are Sample-Efficient NLP Models More Robust? [90.54786862811183]
We investigate the relationship between sample efficiency (amount of data needed to reach a given ID accuracy) and robustness (how models fare on OOD evaluation).
We find that higher sample efficiency is only correlated with better average OOD robustness on some modeling interventions and tasks, but not others.
These results suggest that general-purpose methods for improving sample efficiency are unlikely to yield universal OOD robustness improvements, since such improvements are highly dataset- and task-dependent.
arXiv Detail & Related papers (2022-10-12T17:54:59Z)
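As a rough illustration of the two quantities above, the following sketch interpolates each model's learning curve to estimate the training-set size needed to reach a target ID accuracy and correlates that with OOD accuracy; the models, curves, and target value are hypothetical:

```python
import numpy as np
from scipy.stats import spearmanr

def sample_efficiency(train_sizes, id_accs, target_acc):
    """Interpolated training-set size at which the learning curve reaches the
    target ID accuracy; smaller values mean a more sample-efficient model."""
    train_sizes, id_accs = np.asarray(train_sizes), np.asarray(id_accs)
    if id_accs.max() < target_acc:
        return np.inf  # the observed curve never reaches the target
    return np.interp(target_acc, id_accs, train_sizes)

# Hypothetical testbed: each model has a learning curve and one OOD accuracy.
models = {
    "a": dict(sizes=[1e3, 1e4, 1e5], id=[0.60, 0.72, 0.80], ood=0.55),
    "b": dict(sizes=[1e3, 1e4, 1e5], id=[0.55, 0.70, 0.82], ood=0.60),
    "c": dict(sizes=[1e3, 1e4, 1e5], id=[0.65, 0.78, 0.85], ood=0.58),
}
efficiency = [sample_efficiency(m["sizes"], m["id"], target_acc=0.75) for m in models.values()]
ood_acc = [m["ood"] for m in models.values()]
print(spearmanr(efficiency, ood_acc))  # rank correlation across the testbed
```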
- No One Representation to Rule Them All: Overlapping Features of Training Methods [12.58238785151714]
High-performing models tend to make similar predictions regardless of training methodology.
Recent work has shown that very different training techniques, such as large-scale contrastive learning, can yield competitively high accuracy.
We show that these models generalize differently across subsets of the data, leading to higher ensemble performance.
arXiv Detail & Related papers (2021-10-20T21:29:49Z)
- A Multi-Level Attention Model for Evidence-Based Fact Checking [58.95413968110558]
We present a simple model that can be trained on sequence structures.
Results on a large-scale dataset for Fact Extraction and VERification show that our model outperforms the graph-based approaches.
arXiv Detail & Related papers (2021-06-02T05:40:12Z)
- How Training Data Impacts Performance in Learning-based Control [67.7875109298865]
This paper derives an analytical relationship between the density of the training data and the control performance.
We formulate a quality measure for the data set, which we refer to as the $\rho$-gap.
We show how the $\rho$-gap can be applied to a feedback linearizing control law.
arXiv Detail & Related papers (2020-05-25T12:13:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.