Distinguishing rule- and exemplar-based generalization in learning
systems
- URL: http://arxiv.org/abs/2110.04328v1
- Date: Fri, 8 Oct 2021 18:37:59 GMT
- Title: Distinguishing rule- and exemplar-based generalization in learning
systems
- Authors: Ishita Dasgupta, Erin Grant, Thomas L. Griffiths
- Abstract summary: We investigate two distinct inductive biases: feature-level bias and exemplar-vs-rule bias.
We find that most standard neural network models have a propensity towards exemplar-based extrapolation.
We discuss the implications of these findings for research on data augmentation, fairness, and systematic generalization.
- Score: 10.396761067379195
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the increasing scale of datasets in machine learning, generalization
to unseen regions of the data distribution remains crucial. Such extrapolation
is by definition underdetermined and is dictated by a learner's inductive
biases. Machine learning systems often do not share the same inductive biases
as humans and, as a result, extrapolate in ways that are inconsistent with our
expectations. We investigate two distinct such inductive biases: feature-level
bias (differences in which features are more readily learned) and
exemplar-vs-rule bias (differences in how these learned features are used for
generalization). Exemplar- vs. rule-based generalization has been studied
extensively in cognitive psychology, and, in this work, we present a protocol
inspired by these experimental approaches for directly probing this trade-off
in learning systems. The measures we propose characterize changes in
extrapolation behavior when feature coverage is manipulated in a combinatorial
setting. We present empirical results across a range of models and across both
expository and real-world image and language domains. We demonstrate that
measuring the exemplar-rule trade-off while controlling for feature-level bias
provides a more complete picture of extrapolation behavior than existing
formalisms. We find that most standard neural network models have a propensity
towards exemplar-based extrapolation and discuss the implications of these
findings for research on data augmentation, fairness, and systematic
generalization.
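To make the proposed probe concrete, below is a minimal, hypothetical sketch of the kind of combinatorial feature-coverage manipulation the abstract describes; it is not the authors' released code or exact stimuli. The two binary features, the rule-defining feature A, the `make_data` helper, and the use of an sklearn MLP are all illustrative assumptions.

```python
# Hypothetical sketch of a combinatorial feature-coverage probe, in the spirit
# of the protocol described in the abstract (not the authors' exact setup).
# Two binary features A and B; the label follows a rule on feature A alone.
# We compare the model's prediction at a never-seen combination (A=1, B=1)
# between two training conditions that differ only in feature coverage.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def make_data(combos, n_per_combo=200, noise=0.1):
    """Noisy points around each (A, B) combination; the label is simply A."""
    X = np.concatenate([[a, b] + noise * rng.standard_normal((n_per_combo, 2))
                        for a, b in combos])
    y = np.concatenate([np.full(n_per_combo, a) for a, _ in combos])
    return X, y

probe = np.array([[1.0, 1.0]])                     # combination never seen in training
conditions = {
    "control":          [(0, 0), (1, 0)],          # feature B is always 0
    "partial exposure": [(0, 0), (1, 0), (0, 1)],  # B=1 is covered, but only with A=0
}

for name, combos in conditions.items():
    X, y = make_data(combos)
    clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000,
                        random_state=0).fit(X, y)
    p_rule = clf.predict_proba(probe)[0, list(clf.classes_).index(1)]
    print(f"{name:>16}: P(rule-consistent label at (1,1)) = {p_rule:.2f}")

# A rule-following learner keeps using feature A, so this probability is roughly
# invariant across conditions; an exemplar-based learner is pulled toward the
# newly covered (A=0, B=1) exemplars, so it drops under partial exposure.
# The size of that change is one way to quantify the exemplar-rule trade-off.
```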
Related papers
- Bias in Motion: Theoretical Insights into the Dynamics of Bias in SGD Training [7.5041863920639456]
Machine learning systems often acquire biases by leveraging undesired features in the data, impacting accuracy across different sub-populations.
This paper explores the evolution of bias in a teacher-student setup modeling different data sub-populations with a Gaussian-mixture model.
Applying our findings to fairness and robustness, we delineate how and when heterogeneous data and spurious features can generate and amplify bias.
arXiv Detail & Related papers (2024-05-28T15:50:10Z)
- Causality and Independence Enhancement for Biased Node Classification [56.38828085943763]
We propose a novel Causality and Independence Enhancement (CIE) framework, applicable to various graph neural networks (GNNs).
Our approach estimates causal and spurious features at the node representation level and mitigates the influence of spurious correlations.
CIE not only significantly enhances the performance of GNNs but also outperforms state-of-the-art debiased node classification methods.
arXiv Detail & Related papers (2023-10-14T13:56:24Z)
- Modeling Uncertain Feature Representation for Domain Generalization [49.129544670700525]
We show that our method consistently improves the network generalization ability on multiple vision tasks.
Our method is simple yet effective and can be readily integrated into networks without additional trainable parameters or loss constraints.
arXiv Detail & Related papers (2023-01-16T14:25:02Z)
- Prisoners of Their Own Devices: How Models Induce Data Bias in Performative Prediction [4.874780144224057]
A biased model can make decisions that disproportionately harm certain groups in society.
Much work has been devoted to measuring unfairness in static ML environments, but not in dynamic, performative prediction ones.
We propose a taxonomy to characterize bias in the data, and study cases where it is shaped by model behaviour.
arXiv Detail & Related papers (2022-06-27T10:56:04Z)
- Unsupervised Learning of Unbiased Visual Representations [10.871587311621974]
Deep neural networks are known for their inability to learn robust representations when biases exist in the dataset.
We propose a fully unsupervised debiasing framework, consisting of three steps.
We employ state-of-the-art supervised debiasing techniques to obtain an unbiased model.
arXiv Detail & Related papers (2022-04-26T10:51:50Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model, analogous to gradient descent in functional space.
GGD can learn a more robust base model under both settings: task-specific biased models with prior knowledge, and a self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Learning Debiased Representation via Disentangled Feature Augmentation [19.348340314001756]
This paper presents an empirical analysis revealing that training with "diverse" bias-conflicting samples is crucial for debiasing.
We propose a novel feature-level data augmentation technique in order to synthesize diverse bias-conflicting samples.
arXiv Detail & Related papers (2021-07-03T08:03:25Z)
- Evading the Simplicity Bias: Training a Diverse Set of Models Discovers Solutions with Superior OOD Generalization [93.8373619657239]
Neural networks trained with SGD were recently shown to rely preferentially on linearly-predictive features.
This simplicity bias can explain their lack of robustness out of distribution (OOD).
We demonstrate that the simplicity bias can be mitigated and OOD generalization improved.
arXiv Detail & Related papers (2021-05-12T12:12:24Z)
- The Role of Mutual Information in Variational Classifiers [47.10478919049443]
We study the generalization error of classifiers relying on encodings trained on the cross-entropy loss.
We derive bounds on the generalization error, showing that there exists a regime in which it is bounded by the mutual information.
arXiv Detail & Related papers (2020-10-22T12:27:57Z)
- Learning from Failure: Training Debiased Classifier from Biased Classifier [76.52804102765931]
We show that neural networks learn to rely on spurious correlation only when it is "easier" to learn than the desired knowledge.
We propose a failure-based debiasing scheme by training a pair of neural networks simultaneously.
Our method significantly improves the training of the network against various types of biases in both synthetic and real-world datasets.
arXiv Detail & Related papers (2020-07-06T07:20:29Z)
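The failure-based scheme in the last entry above (Learning from Failure) trains a deliberately biased network alongside the debiased one and upweights the samples on which the biased network fails. A simplified, hypothetical PyTorch sketch of that pairing follows; the published method uses a generalized cross-entropy loss for the biased network and a relative-difficulty reweighting, but the function names, constants, and update structure here are illustrative assumptions, not the authors' code.

```python
# Hypothetical sketch of a failure-based debiasing pair (simplified reading).
import torch
import torch.nn.functional as F

def gce_loss(logits, targets, q=0.7):
    """Generalized cross-entropy: emphasizes easy samples, amplifying bias."""
    p_correct = F.softmax(logits, dim=1).gather(1, targets[:, None]).squeeze(1)
    return ((1.0 - p_correct.clamp(min=1e-6) ** q) / q).mean()

def lff_step(f_biased, f_debiased, opt_b, opt_d, x, y):
    """One simultaneous update of the biased and debiased networks.
    x: input batch; y: LongTensor of class indices."""
    # Biased network: trained with GCE so it latches onto "easy" (spurious) cues.
    opt_b.zero_grad()
    gce_loss(f_biased(x), y).backward()
    opt_b.step()

    # Per-sample weights: large when the biased network fails on a sample,
    # so bias-conflicting samples are upweighted for the debiased network.
    with torch.no_grad():
        ce_b = F.cross_entropy(f_biased(x), y, reduction="none")
    logits_d = f_debiased(x)
    ce_d = F.cross_entropy(logits_d, y, reduction="none")
    w = ce_b / (ce_b + ce_d.detach() + 1e-8)

    opt_d.zero_grad()
    (w * ce_d).mean().backward()
    opt_d.step()
```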
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.