Principles from Clinical Research for NLP Model Generalization
- URL: http://arxiv.org/abs/2311.03663v3
- Date: Tue, 2 Apr 2024 02:27:12 GMT
- Title: Principles from Clinical Research for NLP Model Generalization
- Authors: Aparna Elangovan, Jiayuan He, Yuan Li, Karin Verspoor
- Abstract summary: We explore the foundations of generalizability and study the factors that affect it.
We demonstrate how learning spurious correlations, such as the distance between entities in relation extraction tasks, can affect a model's internal validity.
- Score: 10.985226652193543
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The NLP community typically relies on performance of a model on a held-out test set to assess generalization. Performance drops observed in datasets outside of official test sets are generally attributed to "out-of-distribution" effects. Here, we explore the foundations of generalizability and study the factors that affect it, articulating lessons from clinical studies. In clinical research, generalizability is an act of reasoning that depends on (a) internal validity of experiments to ensure controlled measurement of cause and effect, and (b) external validity or transportability of the results to the wider population. We demonstrate how learning spurious correlations, such as the distance between entities in relation extraction tasks, can affect a model's internal validity and in turn adversely impact generalization. We, therefore, present the need to ensure internal validity when building machine learning models in NLP. Our recommendations also apply to generative large language models, as they are known to be sensitive to even minor semantic preserving alterations. We also propose adapting the idea of matching in randomized controlled trials and observational studies to NLP evaluation to measure causation.
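To make the internal-validity point concrete, here is a minimal, fully synthetic sketch (data, numbers, and variable names are invented for illustration, not taken from the paper) of how a spurious entity-distance feature alone can ace a relation-extraction-style held-out test:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Fully synthetic "relation extraction" data: the single feature is the
# token distance between the two entities, and (by construction) related
# pairs tend to sit closer together -- a spurious shortcut.
n = 2000
label = rng.integers(0, 2, size=n)                        # 1 = "related"
distance = rng.normal(np.where(label == 1, 5.0, 25.0), 4.0, size=n)

X_train, X_test, y_train, y_test = train_test_split(
    distance.reshape(-1, 1), label, random_state=0)
clf = LogisticRegression().fit(X_train, y_train)
print(f"held-out accuracy from distance alone: {clf.score(X_test, y_test):.2f}")
# Near-perfect accuracy here shows why a held-out test set drawn from the
# same distribution cannot certify internal validity: the "cause" of the
# prediction is entity distance, not relation semantics.
```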
Related papers
- Mixstyle-Entropy: Domain Generalization with Causal Intervention and Perturbation [38.97031630265987]
Domain generalization (DG) addresses distribution shift by learning representations independent of domain-related information, thus facilitating extrapolation to unseen environments.
Existing approaches typically focus on formulating tailored training objectives to extract shared features from the source data.
We propose a novel framework based on causality, named InPer, designed to enhance model generalization by incorporating causal intervention during training and causal perturbation during testing.
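As background, a minimal PyTorch sketch of the MixStyle operation the title alludes to: mixing per-instance feature statistics across a batch to synthesize novel "styles". This is generic MixStyle, not InPer's specific intervention and perturbation steps, which the summary above does not detail:

```python
import torch

def mixstyle(x: torch.Tensor, alpha: float = 0.1, eps: float = 1e-6) -> torch.Tensor:
    """Mix per-instance feature statistics (mean/std) of (B, C, H, W)
    feature maps across the batch to synthesize novel 'styles'."""
    B = x.size(0)
    mu = x.mean(dim=(2, 3), keepdim=True)
    sig = (x.var(dim=(2, 3), keepdim=True) + eps).sqrt()
    x_norm = (x - mu) / sig
    # Convex mixing weights drawn from a symmetric Beta distribution
    lam = torch.distributions.Beta(alpha, alpha).sample((B, 1, 1, 1)).to(x.device)
    perm = torch.randperm(B, device=x.device)
    mu_mix = lam * mu + (1 - lam) * mu[perm]
    sig_mix = lam * sig + (1 - lam) * sig[perm]
    return x_norm * sig_mix + mu_mix
```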
arXiv Detail & Related papers (2024-08-07T07:54:19Z)
- Estimating Causal Effects with Double Machine Learning -- A Method Evaluation [5.904095466127043]
We review one of the most prominent methods: "double/debiased machine learning" (DML).
Our findings indicate that the application of a suitably flexible machine learning algorithm within DML improves the adjustment for various nonlinear confounding relationships.
When estimating the effects of air pollution on housing prices, we find that DML estimates are consistently larger than estimates of less flexible methods.
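For orientation, a minimal sketch of the partialling-out (Robinson-style) form of DML with cross-fitting; the random-forest nuisance learners and the function name are illustrative choices, not the paper's exact setup:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

def dml_effect(X, d, y, n_splits=5):
    """Partialling-out DML: cross-fitted residual-on-residual regression
    of outcome y on treatment d, adjusting for confounders X."""
    res_d = np.zeros(len(d))
    res_y = np.zeros(len(y))
    for train, test in KFold(n_splits, shuffle=True, random_state=0).split(X):
        m_hat = RandomForestRegressor(random_state=0).fit(X[train], d[train])
        g_hat = RandomForestRegressor(random_state=0).fit(X[train], y[train])
        res_d[test] = d[test] - m_hat.predict(X[test])   # D - E[D|X]
        res_y[test] = y[test] - g_hat.predict(X[test])   # Y - E[Y|X]
    # Final stage: no-intercept OLS slope of res_y on res_d
    return float(res_d @ res_y / (res_d @ res_d))
```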
arXiv Detail & Related papers (2024-03-21T13:21:33Z)
- A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error of overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by, e.g., the combination of model and parameter-initialization scheme.
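For reference, the classical McAllester-style PAC-Bayes bound; the paper's Interpolating Information Criterion refines this type of statement, and the textbook form below is given only as background:

```latex
% With probability >= 1 - \delta over an i.i.d. sample S of size n,
% simultaneously for all posteriors \rho over hypotheses:
\[
\mathbb{E}_{h \sim \rho}\big[L(h)\big]
\;\le\;
\mathbb{E}_{h \sim \rho}\big[\hat{L}_S(h)\big]
\;+\;
\sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) \;+\; \ln\frac{2\sqrt{n}}{\delta}}{2n}}
\]
% \pi is a data-independent prior, L the true risk, \hat{L}_S the empirical risk.
```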
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
- Understanding Robust Overfitting from the Feature Generalization Perspective [61.770805867606796]
Adversarial training (AT) constructs robust neural networks by incorporating adversarial perturbations into natural data.
It is plagued by the issue of robust overfitting (RO), which severely damages the model's robustness.
In this paper, we investigate RO from a novel feature generalization perspective.
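For context, a minimal PyTorch sketch of the PGD-based adversarial training loop in which robust overfitting is typically observed; hyperparameters are common defaults, not the paper's:

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """L-infinity PGD: iterated signed-gradient steps, projected to the eps-ball."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project back to the ball
        x_adv = x_adv.clamp(0, 1)                 # keep valid pixel range
    return x_adv.detach()

def adversarial_training_step(model, optimizer, x, y):
    """One AT step: fit the model on worst-case perturbed inputs."""
    x_adv = pgd_attack(model, x, y)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```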
arXiv Detail & Related papers (2023-10-01T07:57:03Z)
- Cross-functional Analysis of Generalisation in Behavioural Learning [4.0810783261728565]
We introduce BeLUGA, an analysis method for evaluating behavioural learning considering generalisation across dimensions of different levels.
An aggregate score measures generalisation to unseen functionalities (or, conversely, overfitting).
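A hypothetical sketch of the general idea of aggregating pass rates over behavioural test suites; BeLUGA's actual scoring is more elaborate, and all names here are invented:

```python
from typing import Callable, Dict, List, Tuple

def behavioural_report(predict: Callable[[str], int],
                       suites: Dict[str, List[Tuple[str, int]]]) -> Dict[str, float]:
    """Pass rate per behavioural test suite, plus a macro-averaged aggregate.
    Running it on suites held out from training gives a rough
    generalisation-to-unseen-functionality score."""
    report = {name: sum(predict(text) == gold for text, gold in cases) / len(cases)
              for name, cases in suites.items()}
    report["aggregate"] = sum(report.values()) / len(report)
    return report
```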
arXiv Detail & Related papers (2023-05-22T11:54:19Z)
- A Causal Inference Framework for Leveraging External Controls in Hybrid Trials [1.7942265700058988]
We consider the challenges associated with causal inference in settings where data from a randomized trial is augmented with control data from an external source.
We propose estimators, review efficiency bounds, and describe an approach for efficient doubly-robust estimation.
We apply the framework to a trial investigating the effect of risdiplam on motor function in patients with spinal muscular atrophy.
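A minimal sketch of a generic doubly-robust (AIPW) estimator of the average treatment effect; the paper's hybrid-trial estimators additionally handle external control data, so this simplified version is only for orientation:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

def aipw_ate(X, t, y):
    """Augmented IPW (doubly robust) average treatment effect:
    consistent if either the propensity or the outcome model is correct."""
    # Propensity scores, clipped away from 0 and 1 for stability
    e = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]
    e = np.clip(e, 0.01, 0.99)
    # Outcome regressions fit separately on treated and control units
    mu1 = LinearRegression().fit(X[t == 1], y[t == 1]).predict(X)
    mu0 = LinearRegression().fit(X[t == 0], y[t == 0]).predict(X)
    psi = mu1 - mu0 + t * (y - mu1) / e - (1 - t) * (y - mu0) / (1 - e)
    return float(psi.mean())
```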
arXiv Detail & Related papers (2023-05-15T19:15:32Z)
- Causal Inference via Nonlinear Variable Decorrelation for Healthcare Applications [60.26261850082012]
We introduce a novel method with a variable decorrelation regularizer to handle both linear and nonlinear confounding.
We employ association rules, mined from the original features, as new representations to increase model interpretability.
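A minimal sketch of a linear variable-decorrelation regularizer; the paper's method also handles nonlinear confounding, whereas this simplified version only penalizes pairwise linear correlations:

```python
import torch

def decorrelation_penalty(z: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Sum of squared off-diagonal correlations of an (n, d) feature matrix z.
    Added to the task loss, it pushes learned features toward pairwise
    (linear) independence."""
    z = (z - z.mean(dim=0)) / (z.std(dim=0) + eps)
    corr = z.T @ z / z.size(0)
    off_diag = corr - torch.diag(torch.diagonal(corr))
    return (off_diag ** 2).sum()
```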
arXiv Detail & Related papers (2022-09-29T17:44:14Z)
- Confounder Identification-free Causal Visual Feature Learning [84.28462256571822]
We propose a novel Confounder Identification-free Causal Visual Feature Learning (CICF) method, which obviates the need for identifying confounders.
CICF models the interventions among different samples based on the front-door criterion, and then approximates the global-scope intervening effect from the instance-level interventions.
We uncover the relation between CICF and the popular meta-learning strategy MAML, and provide a theoretical interpretation of why MAML works.
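For reference, the front-door adjustment that the method builds on, for treatment x, mediator m, and outcome y:

```latex
\[
P\big(y \mid \mathrm{do}(x)\big)
= \sum_{m} P(m \mid x) \sum_{x'} P\big(y \mid m, x'\big)\, P(x')
\]
```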
arXiv Detail & Related papers (2021-11-26T10:57:47Z)
- Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be relied on as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
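As general background for adjusting over observable confounders (the paper's counterfactual-MLE objective builds on this kind of adjustment, stated here only as context), the standard backdoor formula for an observed confounder z of treatment x and outcome y:

```latex
\[
P\big(y \mid \mathrm{do}(x)\big) = \sum_{z} P(y \mid x, z)\, P(z)
\]
```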
arXiv Detail & Related papers (2021-06-07T17:47:16Z)
- Efficient Causal Inference from Combined Observational and Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects.
We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders with a single latent confounder, without changing the observational and interventional distributions.
We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z)
- Active Invariant Causal Prediction: Experiment Selection through Stability [4.56877715768796]
In this work we propose a new active learning (i.e., experiment selection) framework (A-ICP) based on Invariant Causal Prediction (ICP).
For general structural causal models, we characterize the effect of interventions on so-called stable sets.
We propose several intervention selection policies for A-ICP which quickly reveal the direct causes of a response variable in the causal graph.
Empirically, we analyze the performance of the proposed policies in both population and finite-regime experiments.
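A simplified sketch of the plain-ICP invariance test that A-ICP builds on; here only residual-mean invariance is checked with a one-way ANOVA, whereas real ICP tests full residual distributions and A-ICP adds the experiment-selection layer on top:

```python
from itertools import combinations
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression

def icp_plausible_sets(X, y, env, alpha=0.05):
    """Accept predictor subsets whose regression residuals have invariant
    means across environments (exhaustive over 2^d subsets; small d only)."""
    accepted = []
    d = X.shape[1]
    for k in range(d + 1):
        for S in combinations(range(d), k):
            if S:
                XS = X[:, list(S)]
                resid = y - LinearRegression().fit(XS, y).predict(XS)
            else:
                resid = y - y.mean()
            groups = [resid[env == e] for e in np.unique(env)]
            if stats.f_oneway(*groups).pvalue > alpha:  # invariance not rejected
                accepted.append(S)
    return accepted

# ICP estimates the causal parents as the intersection of all accepted sets.
```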
arXiv Detail & Related papers (2020-06-10T07:07:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.