Predictors from causal features do not generalize better to new domains
- URL: http://arxiv.org/abs/2402.09891v1
- Date: Thu, 15 Feb 2024 11:34:38 GMT
- Title: Predictors from causal features do not generalize better to new domains
- Authors: Vivian Y. Nastl and Moritz Hardt
- Abstract summary: We study how well machine learning models trained on causal features generalize across domains.
Our goal is to test the hypothesis that models trained on causal features generalize better across domains.
- Score: 18.95420918106124
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We study how well machine learning models trained on causal features
generalize across domains. We consider 16 prediction tasks on tabular datasets
covering applications in health, employment, education, social benefits, and
politics. Each dataset comes with multiple domains, allowing us to test how
well a model trained in one domain performs in another. For each prediction
task, we select features that have a causal influence on the target of
prediction. Our goal is to test the hypothesis that models trained on causal
features generalize better across domains. Without exception, we find that
predictors using all available features, regardless of causality, have better
in-domain and out-of-domain accuracy than predictors using causal features.
Moreover, even the absolute drop in accuracy from one domain to the other is no
better for causal predictors than for models that use all features. If the goal
is to generalize to new domains, practitioners might as well train the best
possible model on all available features.
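The evaluation protocol described in the abstract — train one predictor on a hand-selected causal feature subset and another on all features, then compare in-domain and out-of-domain accuracy — can be sketched on synthetic data. This is an illustrative toy, not the paper's pipeline: the domains, features, and least-squares classifier are all invented here, and the spurious feature is constructed to flip sign across domains, the textbook case where causal features *should* win. The paper's empirical point is that on real tabular tasks this advantage does not materialize.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_domain(n, spurious_sign):
    """Synthetic binary task: one causal feature, one domain-specific spurious feature."""
    y = rng.integers(0, 2, n)
    x_causal = y + rng.normal(0.0, 1.0, n)                    # stable across domains
    x_spurious = spurious_sign * y + rng.normal(0.0, 0.5, n)  # flips between domains
    return np.column_stack([x_causal, x_spurious]), y

def fit_linear(X, y):
    """Least-squares linear classifier (a stand-in for any model class)."""
    Xb = np.column_stack([X, np.ones(len(X))])   # append intercept column
    w, *_ = np.linalg.lstsq(Xb, 2 * y - 1, rcond=None)
    return w

def accuracy(w, X, y):
    Xb = np.column_stack([X, np.ones(len(X))])
    return np.mean((Xb @ w > 0) == (y == 1))

X_train, y_train = make_domain(2000, spurious_sign=+1.0)  # source domain
X_test, y_test = make_domain(2000, spurious_sign=-1.0)    # shifted target domain

causal_cols = [0]  # indices of the hand-selected causal features
w_causal = fit_linear(X_train[:, causal_cols], y_train)
w_all = fit_linear(X_train, y_train)

print("in-domain  causal:", accuracy(w_causal, X_train[:, causal_cols], y_train))
print("in-domain  all   :", accuracy(w_all, X_train, y_train))
print("out-domain causal:", accuracy(w_causal, X_test[:, causal_cols], y_test))
print("out-domain all   :", accuracy(w_all, X_test, y_test))
```

In this toy construction the all-features model exploits the spurious feature and degrades out of domain while the causal model holds steady; the paper's finding is that across 16 real tasks the comparison comes out the other way.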
Related papers
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- Out-of-Domain Robustness via Targeted Augmentations [90.94290420322457]
We study principles for designing data augmentations for out-of-domain generalization.
Motivated by theoretical analysis on a linear setting, we propose targeted augmentations.
We show that targeted augmentations set a new state of the art for OOD performance, improving it by 3.2-15.2 percentage points.
arXiv Detail & Related papers (2023-02-23T08:59:56Z)
- Outlier-Based Domain of Applicability Identification for Materials Property Prediction Models [0.38073142980733]
We propose a method to find domains of applicability using a large feature space and also introduce analysis techniques to gain more insight into the detected domains.
arXiv Detail & Related papers (2023-01-17T07:51:12Z)
- Rationalizing Predictions by Adversarial Information Calibration [65.19407304154177]
We train two models jointly: one is a typical neural model that solves the task at hand in an accurate but black-box manner, and the other is a selector-predictor model that additionally produces a rationale for its prediction.
We use an adversarial technique to calibrate the information extracted by the two models such that the difference between them is an indicator of the missed or over-selected features.
arXiv Detail & Related papers (2023-01-15T03:13:09Z)
- Assessing Out-of-Domain Language Model Performance from Few Examples [38.245449474937914]
We address the task of predicting out-of-domain (OOD) performance in a few-shot fashion.
We benchmark performance on this task using model accuracy on the few-shot examples.
We show that attribution-based factors can help rank relative model OOD performance.
arXiv Detail & Related papers (2022-10-13T04:45:26Z)
- Uncertainty Modeling for Out-of-Distribution Generalization [56.957731893992495]
We argue that the feature statistics can be properly manipulated to improve the generalization ability of deep learning models.
Common methods often consider the feature statistics as deterministic values measured from the learned features.
We improve the network generalization ability by modeling the uncertainty of domain shifts with synthesized feature statistics during training.
arXiv Detail & Related papers (2022-02-08T16:09:12Z)
- Out-of-Distribution Generalization Analysis via Influence Function [25.80365416547478]
The mismatch between training and target data is one major challenge for machine learning systems.
We introduce Influence Function, a classical tool from robust statistics, into the OOD generalization problem.
We show that the accuracy on test domains and the proposed index together can help us discern whether OOD algorithms are needed and whether a model achieves good OOD generalization.
arXiv Detail & Related papers (2021-01-21T09:59:55Z)
- Learning from the Best: Rationalizing Prediction by Adversarial Information Calibration [39.685626118667074]
We train two models jointly: one is a typical neural model that solves the task at hand in an accurate but black-box manner, and the other is a selector-predictor model that additionally produces a rationale for its prediction.
We use an adversarial-based technique to calibrate the information extracted by the two models.
For natural language tasks, we propose to use a language-model-based regularizer to encourage the extraction of fluent rationales.
arXiv Detail & Related papers (2020-12-16T11:54:15Z)
- Batch Normalization Embeddings for Deep Domain Generalization [50.51405390150066]
Domain generalization aims at training machine learning models to perform robustly across different and unseen domains.
We show a significant increase in classification accuracy over current state-of-the-art techniques on popular domain generalization benchmarks.
arXiv Detail & Related papers (2020-11-25T12:02:57Z)
- Adaptive Risk Minimization: Learning to Adapt to Domain Shift [109.87561509436016]
A fundamental assumption of most machine learning algorithms is that the training and test data are drawn from the same underlying distribution.
In this work, we consider the problem setting of domain generalization, where the training data are structured into domains and there may be multiple test time shifts.
We introduce the framework of adaptive risk minimization (ARM), in which models are directly optimized for effective adaptation to shift by learning to adapt on the training domains.
arXiv Detail & Related papers (2020-07-06T17:59:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.