Toward Learning Human-aligned Cross-domain Robust Models by Countering
Misaligned Features
- URL: http://arxiv.org/abs/2111.03740v1
- Date: Fri, 5 Nov 2021 22:14:41 GMT
- Title: Toward Learning Human-aligned Cross-domain Robust Models by Countering
Misaligned Features
- Authors: Haohan Wang, Zeyi Huang, Hanlin Zhang, Eric Xing
- Abstract summary: Machine learning has demonstrated remarkable prediction accuracy over i.i.d. data, but the accuracy often drops when tested with data from another distribution.
In this paper, we offer another perspective on this problem, assuming that the accuracy drop stems from models relying on features that are not aligned well with how a data annotator judges similarity.
We extend the conventional generalization error bound to a new one for this setup, given knowledge of how the misaligned features are associated with the label.
- Score: 17.57706440574503
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Machine learning has demonstrated remarkable prediction accuracy
over i.i.d. data, but the accuracy often drops when tested with data from
another distribution. In this paper, we offer another perspective on this
problem, assuming that the accuracy drop stems from models relying on
features that are not aligned well with how a data annotator judges
similarity across these two datasets. We refer to these features as
misaligned features. We extend the conventional generalization error bound to
a new one for this setup, given knowledge of how the misaligned features are
associated with the label. Our analysis offers a set of techniques for this
problem, and these techniques are naturally linked to many previous methods
in the robust machine learning literature. We also compare the empirical
strength of these methods and demonstrate the performance achieved when
these previous techniques are combined.
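To make the abstract's central object concrete, a conventional generalization bound of the kind being extended is sketched below. The empirical-risk, capacity, and confidence terms are the textbook Rademacher-complexity form; the final misalignment penalty is only a schematic placeholder for the paper's actual term, with assumed notation.

```latex
% With probability at least 1 - \delta over an i.i.d. sample of size n,
% a conventional bound reads (Rademacher-complexity form):
R(h) \;\le\; \hat{R}(h)
      \;+\; 2\,\mathfrak{R}_n(\mathcal{H})
      \;+\; \sqrt{\frac{\log(1/\delta)}{2n}}
% The paper's setup (schematically) augments this with a term tracking the
% model's reliance on misaligned features:
      \;+\; \Delta_{\mathrm{mis}}(h) \quad \text{(placeholder, not the paper's exact term)}
```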
Related papers
- The Star Geometry of Critic-Based Regularizer Learning [2.2530496464901106]
Variational regularization is a technique to solve statistical inference tasks and inverse problems.
Recent works learn task-dependent regularizers by integrating information about the measurements and ground-truth data.
There is little theory about the structure of regularizers learned via this process and how it relates to the two data distributions.
arXiv Detail & Related papers (2024-08-29T18:34:59Z)
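As background for the entry above: variational regularization recovers a signal x from indirect measurements y by minimizing a data-fidelity term plus a regularizer, and in the critic-based learned setting the regularizer R_theta is itself trained. The generic form below uses assumed notation, not that paper's.

```latex
\hat{x} \;=\; \arg\min_{x} \; \tfrac{1}{2}\,\lVert Ax - y \rVert_2^2 \;+\; \lambda\, R_{\theta}(x)
```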
- Debiasing Machine Unlearning with Counterfactual Examples [31.931056076782202]
We analyze the causal factors behind the unlearning process and mitigate biases at both the data and algorithmic levels.
We introduce an intervention-based approach, where knowledge to forget is erased with a debiased dataset.
Our method outperforms existing machine unlearning baselines on evaluation metrics.
arXiv Detail & Related papers (2024-04-24T09:33:10Z)
- Instance-Specific Asymmetric Sensitivity in Differential Privacy [2.855485723554975]
We build upon previous work that gives a paradigm for selecting an output through the exponential mechanism.
Our framework slightly modifies the closeness metric and instead gives a simple and efficient application of the sparse vector technique.
arXiv Detail & Related papers (2023-11-02T05:01:45Z)
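For reference, the baseline selection paradigm that entry builds on is the classic exponential mechanism (McSherry-Talwar). Below is a minimal Python sketch of that baseline; the candidates, utility, and sensitivity are illustrative assumptions, and the paper's modified closeness metric and sparse-vector variant are not reproduced here.

```python
import numpy as np

def exponential_mechanism(candidates, utility, epsilon, sensitivity, rng=None):
    """Sample a candidate with probability proportional to
    exp(epsilon * utility / (2 * sensitivity))."""
    rng = rng or np.random.default_rng()
    scores = np.array([utility(c) for c in candidates], dtype=float)
    # Subtract the max before exponentiating for numerical stability.
    logits = epsilon * scores / (2.0 * sensitivity)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return candidates[rng.choice(len(candidates), p=probs)]

# Toy usage: privately select the grid point closest to the data's median.
data = np.array([3.1, 3.4, 3.6, 4.0, 9.8])
grid = list(np.linspace(0, 10, 21))
chosen = exponential_mechanism(
    grid, utility=lambda b: -abs(b - np.median(data)),
    epsilon=1.0, sensitivity=1.0)
```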
- In-Context Convergence of Transformers [63.04956160537308]
We study the learning dynamics of a one-layer transformer with softmax attention trained via gradient descent.
For data with imbalanced features, we show that the learning dynamics follow a stage-wise convergence process.
arXiv Detail & Related papers (2023-10-08T17:55:33Z)
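A minimal model of the setting studied in that entry, assuming a toy in-context regression task: a one-layer, single-head softmax-attention network trained with plain gradient descent. The dimensions, data model, and learning rate are illustrative assumptions, not the paper's.

```python
import torch

d, n_ctx, n_samples, lr = 8, 16, 256, 0.1
W_q = torch.randn(d, d, requires_grad=True)
W_k = torch.randn(d, d, requires_grad=True)
W_v = torch.randn(d, 1, requires_grad=True)

X = torch.randn(n_samples, n_ctx, d)          # in-context tokens
y = X[:, :, 0].mean(dim=1, keepdim=True)      # toy regression target

for step in range(200):
    q = X[:, -1:, :] @ W_q                    # query from the last token
    attn = torch.softmax((q @ (X @ W_k).transpose(1, 2)) / d**0.5, dim=-1)
    pred = (attn @ (X @ W_v)).squeeze(1)      # shape (n_samples, 1)
    loss = ((pred - y) ** 2).mean()
    loss.backward()
    with torch.no_grad():                     # vanilla gradient-descent step
        for W in (W_q, W_k, W_v):
            W -= lr * W.grad
            W.grad = None
```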
- Stubborn Lexical Bias in Data and Models [50.79738900885665]
We use a new statistical method to examine whether spurious patterns in data appear in models trained on the data.
We apply an optimization approach to reweight the training data, reducing thousands of spurious correlations.
Surprisingly, though this method can successfully reduce lexical biases in the training data, we still find strong evidence of corresponding bias in the trained models.
arXiv Detail & Related papers (2023-06-03T20:12:27Z)
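One way to picture the reweighting step in the entry above, under assumed toy data with a single spurious lexical feature: exponentiated-gradient updates on per-example weights that drive the weighted feature-label covariance toward zero. The optimization scheme and data are my assumptions; the paper's own method handles thousands of such correlations at once.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
spurious = rng.integers(0, 2, n)              # e.g., "contains word X"
labels = spurious | rng.integers(0, 2, n)     # label correlated with feature

w = np.full(n, 1.0 / n)                       # weights on the simplex
eta = 5.0
for _ in range(300):
    mean_s, mean_y = w @ spurious, w @ labels
    cov = w @ (spurious * labels) - mean_s * mean_y
    # Gradient of cov^2 with respect to each weight w_i.
    grad = 2 * cov * (spurious * labels - spurious * mean_y - labels * mean_s)
    w = w * np.exp(-eta * grad)               # exponentiated-gradient step
    w /= w.sum()                              # renormalize onto the simplex
```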
- Mutual Exclusivity Training and Primitive Augmentation to Induce Compositionality [84.94877848357896]
Recent datasets expose a lack of systematic generalization ability in standard sequence-to-sequence models.
We analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias and the tendency to memorize whole examples.
We show substantial empirical improvements using standard sequence-to-sequence models on two widely-used compositionality datasets.
arXiv Detail & Related papers (2022-11-28T17:36:41Z)
- Certifying Data-Bias Robustness in Linear Regression [12.00314910031517]
We present a technique for certifying whether linear regression models are pointwise-robust to label bias in a training dataset.
We show how to solve this problem exactly for individual test points, and provide an approximate but more scalable method.
We also unearth gaps in bias-robustness, such as high levels of non-robustness for certain bias assumptions on some datasets.
arXiv Detail & Related papers (2022-06-07T20:47:07Z)
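A brute-force illustration of the certification question in the entry above: exhaustively perturb up to k training labels and check whether the least-squares prediction at a test point can move beyond a tolerance. This is exact but exponential in k, a naive stand-in for the paper's exact solver and its more scalable approximation; the data, k, delta, and tol values are assumptions.

```python
import numpy as np
from itertools import combinations, product

def lstsq_predict(X, y, x_test):
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return x_test @ w

def certify_pointwise(X, y, x_test, k=1, delta=1.0, tol=0.5):
    """Return True if no perturbation of at most k labels by +/-delta can
    move the prediction at x_test by more than tol."""
    base = lstsq_predict(X, y, x_test)
    n = len(y)
    for m in range(1, k + 1):
        for idx in combinations(range(n), m):
            for signs in product((-delta, delta), repeat=m):
                y_pert = y.copy()
                y_pert[list(idx)] += signs
                if abs(lstsq_predict(X, y_pert, x_test) - base) > tol:
                    return False      # some biased labeling flips the answer
    return True

# Toy usage with assumed data:
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=20)
print(certify_pointwise(X, y, X[0], k=2, delta=1.0, tol=0.5))
```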
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting target accuracy as the fraction of unlabeled target examples whose confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
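A minimal sketch of ATC as summarized above: fit a confidence threshold on labeled source data so that the fraction of source points above it matches the observed source accuracy, then report the fraction of unlabeled target points above that threshold as the predicted target accuracy. Max-softmax confidence and the toy score distributions below are assumptions.

```python
import numpy as np

def learn_atc_threshold(source_conf, source_correct):
    # Choose t so that mean(conf > t) on source equals observed accuracy.
    acc = source_correct.mean()
    return np.quantile(source_conf, 1.0 - acc)

def predict_target_accuracy(threshold, target_conf):
    return (target_conf > threshold).mean()

# Toy usage with assumed model outputs:
rng = np.random.default_rng(0)
source_conf = rng.beta(5, 2, size=5000)          # max-softmax confidences
source_correct = rng.random(5000) < source_conf  # correctness indicators
target_conf = rng.beta(4, 3, size=5000)          # shifted target confidences
t = learn_atc_threshold(source_conf, source_correct)
print(predict_target_accuracy(t, target_conf))
```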
- Semi-supervised Long-tailed Recognition using Alternate Sampling [95.93760490301395]
Main challenges in long-tailed recognition come from the imbalanced data distribution and sample scarcity in its tail classes.
We propose a new recognition setting, namely semi-supervised long-tailed recognition.
We demonstrate significant accuracy improvements over other competitive methods on two datasets.
arXiv Detail & Related papers (2021-05-01T00:43:38Z)
- Theoretical bounds on estimation error for meta-learning [29.288915378272375]
We provide novel information-theoretic lower-bounds on minimax rates of convergence for algorithms trained on data from multiple sources and tested on novel data.
Our bounds depend intuitively on the information shared between sources of data, and characterize the difficulty of learning in this setting for arbitrary algorithms.
arXiv Detail & Related papers (2020-10-14T14:57:21Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
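The experiment behind the entry above can be pictured in a small overparameterized linear setup: the minimum-norm solution plus any null-space component still interpolates the training set exactly, so sampling such components gives an empirical distribution of test errors among interpolating classifiers. Dimensions, the perturbation scale, and the Gaussian data model are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, d = 40, 2000, 200            # d >> n_train: interpolation
w_true = rng.normal(size=d) / np.sqrt(d)
X = rng.normal(size=(n_train, d))
y = np.sign(X @ w_true)
X_te = rng.normal(size=(n_test, d))
y_te = np.sign(X_te @ w_true)

w_min = np.linalg.pinv(X) @ y                 # minimum-norm interpolator
# Orthonormal basis of the null space of X: directions that leave the
# training outputs unchanged.
_, _, Vt = np.linalg.svd(X, full_matrices=True)
null_basis = Vt[n_train:]                     # shape (d - n_train, d)

scale = 0.1   # assumed: controls how far we wander from the min-norm model
errors = []
for _ in range(500):
    w = w_min + scale * (null_basis.T @ rng.normal(size=d - n_train))
    assert np.all(np.sign(X @ w) == y)        # still interpolates exactly
    errors.append(np.mean(np.sign(X_te @ w) != y_te))
print(np.mean(errors), np.min(errors), np.max(errors))
```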
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.