Related papers: Accuracy on the wrong line: On the pitfalls of noisy data for out-of-distribution generalisation

URL: http://arxiv.org/abs/2406.19049v1
Date: Thu, 27 Jun 2024 09:57:31 GMT
Title: Accuracy on the wrong line: On the pitfalls of noisy data for out-of-distribution generalisation
Authors: Amartya Sanyal, Yaxi Hu, Yaodong Yu, Yian Ma, Yixin Wang, Bernhard Schölkopf
Abstract summary: We show that noisy data and nuisance features can be sufficient to shatter the Accuracy-on-the-line phenomenon. We demonstrate this phenomenon across both synthetic and real datasets with noisy data and nuisance features.
Score: 70.36344590967519
License: http://creativecommons.org/licenses/by/4.0/
Abstract: "Accuracy-on-the-line" is a widely observed phenomenon in machine learning, where a model's accuracy on in-distribution (ID) and out-of-distribution (OOD) data is positively correlated across different hyperparameters and data configurations. But when does this useful relationship break down? In this work, we explore its robustness. The key observation is that noisy data and the presence of nuisance features can be sufficient to shatter the Accuracy-on-the-line phenomenon. In these cases, ID and OOD accuracy can become negatively correlated, leading to "Accuracy-on-the-wrong-line". This phenomenon can also occur in the presence of spurious (shortcut) features, which tend to overshadow the more complex signal (core, non-spurious) features, resulting in a large nuisance feature space. Moreover, scaling to larger datasets does not mitigate this undesirable behavior and may even exacerbate it. We formally prove a lower bound on Out-of-distribution (OOD) error in a linear classification model, characterizing the conditions on the noise and nuisance features for a large OOD error. We finally demonstrate this phenomenon across both synthetic and real datasets with noisy data and nuisance features.
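The distinction the abstract draws between the two trends can be made concrete with a small sketch. It assumes the trend is summarised by the sign of the Pearson correlation between ID and OOD accuracy across a set of trained models. The function names `pearson` and `line_trend` and the accuracy values are hypothetical and purely illustrative; they are not taken from the paper, which may measure the trend differently.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def line_trend(id_acc, ood_acc):
    """Label the ID/OOD trend across models by the sign of the correlation.

    Assumption: one (ID accuracy, OOD accuracy) pair per trained model,
    e.g. one per hyperparameter or data configuration.
    """
    r = pearson(id_acc, ood_acc)
    if r > 0:
        return "accuracy-on-the-line", r
    if r < 0:
        return "accuracy-on-the-wrong-line", r
    return "no linear trend", r


if __name__ == "__main__":
    # Made-up accuracies for three hypothetical models (not results from the paper).
    id_acc = [0.70, 0.80, 0.90]
    ood_clean = [0.50, 0.60, 0.70]   # OOD accuracy rises with ID accuracy
    ood_noisy = [0.70, 0.60, 0.50]   # OOD accuracy falls as ID accuracy rises
    print(line_trend(id_acc, ood_clean))
    print(line_trend(id_acc, ood_noisy))
```

In this sketch a positive coefficient corresponds to the usual accuracy-on-the-line setting, while a negative one corresponds to the "wrong line" regime that the abstract attributes to noisy data and nuisance features.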

Related papers

Are Domain Generalization Benchmarks with Accuracy on the Line Misspecified? [11.534630666670568]
Spurious correlations, unstable statistical shortcuts a model can exploit, are expected to degrade performance out-of-distribution. We show that current practice evaluates "robustness" without truly stressing the spurious signals we seek to eliminate.
arXiv Detail & Related papers (2025-03-31T19:50:04Z)
Orthogonal Uncertainty Representation of Data Manifold for Robust Long-Tailed Learning [52.021899899683675]
In scenarios with long-tailed distributions, the model's ability to identify tail classes is limited due to the under-representation of tail samples. We propose an Orthogonal Uncertainty Representation (OUR) of feature embedding and an end-to-end training strategy to improve the long-tail phenomenon of model robustness.
arXiv Detail & Related papers (2023-10-16T05:50:34Z)
Test-Time Adaptation Induces Stronger Accuracy and Agreement-on-the-Line [65.14099135546594]
Recent test-time adaptation (TTA) methods drastically strengthen the ACL and AGL trends in models, even in shifts where models showed very weak correlations before. Our results show that by combining TTA with AGL-based estimation methods, we can estimate the OOD performance of models with high precision for a broader set of distribution shifts.
arXiv Detail & Related papers (2023-10-07T23:21:25Z)
Understanding the Impact of Adversarial Robustness on Accuracy Disparity [18.643495650734398]
We decompose the impact of adversarial robustness into two parts: an inherent effect that will degrade the standard accuracy on all classes due to the robustness constraint, and the other caused by the class imbalance ratio. Our results suggest that the implications may extend to nonlinear models over real-world datasets.
arXiv Detail & Related papers (2022-11-28T20:46:51Z)
Agreement-on-the-Line: Predicting the Performance of Neural Networks under Distribution Shift [18.760716606922482]
We show a similar but surprising phenomenon also holds for the agreement between pairs of neural network classifiers. Our prediction algorithm outperforms previous methods both in shifts where agreement-on-the-line holds and, surprisingly, when accuracy is not on the line.
arXiv Detail & Related papers (2022-06-27T07:50:47Z)
Robustness and Accuracy Could Be Reconcilable by (Proper) Definition [109.62614226793833]
The trade-off between robustness and accuracy has been widely studied in the adversarial literature. We find that it may stem from the improperly defined robust error, which imposes an inductive bias of local invariance. By definition, SCORE facilitates the reconciliation between robustness and accuracy, while still handling the worst-case uncertainty.
arXiv Detail & Related papers (2022-02-21T10:36:09Z)
Classification and Adversarial examples in an Overparameterized Linear Model: A Signal Processing Perspective [10.515544361834241]
State-of-the-art deep learning classifiers are highly susceptible to infinitesimal adversarial perturbations. We find that the learned model is susceptible to adversaries in an intermediate regime where classification generalizes but regression does not. Despite the adversarial susceptibility, we find that classification with these features can be easier than the more commonly studied "independent feature" models.
arXiv Detail & Related papers (2021-09-27T17:35:42Z)
An Investigation of the (In)effectiveness of Counterfactually Augmented Data [10.316235366821111]
We show that while counterfactually-augmented data (CAD) is effective at identifying robust features, it may prevent the model from learning unperturbed robust features. Our results show that the lack of perturbation diversity in current CAD datasets limits its effectiveness on OOD generalization.
arXiv Detail & Related papers (2021-07-01T21:46:43Z)
Removing Spurious Features can Hurt Accuracy and Affect Groups Disproportionately [83.68135652247496]
A natural remedy is to remove spurious features from the model. We show that removal of spurious features can decrease accuracy due to inductive biases. We also show that robust self-training can remove spurious features without affecting the overall accuracy.
arXiv Detail & Related papers (2020-12-07T23:08:59Z)
Learning Causal Models Online [103.87959747047158]
Predictive models can rely on spurious correlations in the data for making predictions. One solution for achieving strong generalization is to incorporate causal structures in the models. We propose an online algorithm that continually detects and removes spurious features.
arXiv Detail & Related papers (2020-06-12T20:49:20Z)
Linear predictor on linearly-generated data with missing values: non consistency and solutions [0.0]
We study the seemingly-simple case where the target to predict is a linear function of the fully-observed data. We show that, in the presence of missing values, the optimal predictor may not be linear.
arXiv Detail & Related papers (2020-02-03T11:49:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.