Backdoor Adjustment of Confounding by Provenance for Robust Text
Classification of Multi-institutional Clinical Notes
- URL: http://arxiv.org/abs/2310.02451v1
- Date: Tue, 3 Oct 2023 21:40:44 GMT
- Title: Backdoor Adjustment of Confounding by Provenance for Robust Text
Classification of Multi-institutional Clinical Notes
- Authors: Xiruo Ding, Zhecheng Sheng, Meliha Yetişgen, Serguei Pakhomov,
Trevor Cohen
- Abstract summary: We evaluate the utility of backdoor adjustment for text classification in a dataset of clinical notes annotated for mentions of substance abuse.
Our results indicate that backdoor adjustment can effectively mitigate confounding shift.
- Score: 7.82404526462678
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Natural Language Processing (NLP) methods have been broadly applied to
clinical tasks. Machine learning and deep learning approaches have been used to
improve the performance of clinical NLP. However, these approaches require
sufficiently large datasets for training, and trained models have been shown to
transfer poorly across sites. These issues have led to the promotion of data
collection and integration across different institutions for accurate and
portable models. However, this can introduce a form of bias called confounding
by provenance. When source-specific data distributions differ at deployment,
this may harm model performance. To address this issue, we evaluate the utility
of backdoor adjustment for text classification in a multi-site dataset of
clinical notes annotated for mentions of substance abuse. Using an evaluation
framework devised to measure robustness to distributional shifts, we assess the
utility of backdoor adjustment. Our results indicate that backdoor adjustment
can effectively mitigate confounding shift.
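The adjustment the abstract refers to marginalizes over the confounder (here, the source institution s): p(y | do(x)) = Σ_s p(y | x, s) p(s). A minimal sketch of that computation follows; the function name, the two-site setup, and the toy probabilities are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def backdoor_adjusted_prob(p_y_given_x_s, p_s):
    """Backdoor adjustment over provenance:
    p(y | do(x)) = sum_s p(y | x, s) * p(s).

    p_y_given_x_s: (n_sites, n_classes) conditional class probabilities,
                   one row per source institution.
    p_s:           (n_sites,) marginal site distribution, summing to 1.
    """
    # Weighted sum of per-site predictions, weighted by site prevalence.
    return p_s @ p_y_given_x_s

# Toy example: two institutions, binary substance-abuse label.
p_y_given_x_s = np.array([[0.9, 0.1],   # site A: confident positive
                          [0.6, 0.4]])  # site B: less confident
p_s = np.array([0.7, 0.3])              # site prevalence in training data

print(backdoor_adjusted_prob(p_y_given_x_s, p_s))  # [0.81 0.19]
```

In practice the per-site conditionals p(y | x, s) would come from a classifier conditioned on a provenance indicator, and p(s) from training-set site frequencies; the adjusted prediction then no longer depends on which site a deployment-time note happens to come from.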
Related papers
- Granularity Matters in Long-Tail Learning [62.30734737735273]
We offer a novel perspective on long-tail learning, inspired by an observation: datasets with finer granularity tend to be less affected by data imbalance.
We introduce open-set auxiliary classes that are visually similar to existing ones, aiming to enhance representation learning for both head and tail classes.
To prevent the overwhelming presence of auxiliary classes from disrupting training, we introduce a neighbor-silencing loss.
arXiv Detail & Related papers (2024-10-21T13:06:21Z) - SINDER: Repairing the Singular Defects of DINOv2 [61.98878352956125]
Vision Transformer models trained on large-scale datasets often exhibit artifacts in the patch tokens they extract.
We propose a novel fine-tuning smooth regularization that rectifies structural deficiencies using only a small dataset.
arXiv Detail & Related papers (2024-07-23T20:34:23Z) - Virtually Objective Quantification of in vitro Wound Healing Scratch Assays with the Segment Anything Model [0.19999259391104385]
The in vitro scratch assay is widely used in cell biology to assess the rate of wound closure.
In this paper, we make use of the segment anything model, a deep foundation model based on interactive point-prompts.
Results showed very low intra- and inter-observer variability, even compared to manual segmentation by domain experts.
arXiv Detail & Related papers (2024-07-02T11:45:56Z) - High Dimensional Causal Inference with Variational Backdoor Adjustment [57.31312942774617]
We take a generative modeling approach to backdoor adjustment for high dimensional treatments and confounders.
Our method is able to estimate interventional likelihood in a variety of high dimensional settings, including semi-synthetic X-ray medical data.
arXiv Detail & Related papers (2023-10-09T19:21:41Z) - Improving Multiple Sclerosis Lesion Segmentation Across Clinical Sites:
A Federated Learning Approach with Noise-Resilient Training [75.40980802817349]
Deep learning models have shown promise for automatically segmenting MS lesions, but the scarcity of accurately annotated data hinders progress in this area.
We introduce a Decoupled Hard Label Correction (DHLC) strategy that considers the imbalanced distribution and fuzzy boundaries of MS lesions.
We also introduce a Centrally Enhanced Label Correction (CELC) strategy, which leverages the aggregated central model as a correction teacher for all sites.
arXiv Detail & Related papers (2023-08-31T00:36:10Z) - Taxonomy Adaptive Cross-Domain Adaptation in Medical Imaging via
Optimization Trajectory Distillation [73.83178465971552]
The success of automated medical image analysis depends on large-scale and expert-annotated training sets.
Unsupervised domain adaptation (UDA) has emerged as a promising approach to alleviate the burden of labeled data collection.
We propose optimization trajectory distillation, a unified approach to address the two technical challenges from a new perspective.
arXiv Detail & Related papers (2023-07-27T08:58:05Z) - Maximizing Model Generalization for Machine Condition Monitoring with
Self-Supervised Learning and Federated Learning [4.214064911004321]
Deep Learning can diagnose faults and assess machine health from raw condition monitoring data without manually designed statistical features.
Traditional supervised learning may struggle to learn compact, discriminative representations that generalize to unseen target domains.
This study proposes maximizing feature generality on the source domain and applying transfer learning (TL) via weight transfer to copy the model to the target domain.
arXiv Detail & Related papers (2023-04-27T17:57:54Z) - Learning Neural Models for Natural Language Processing in the Face of
Distributional Shift [10.990447273771592]
The dominating NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications.
It builds upon the assumption that the data distribution is stationary, i.e., that the data is sampled from a fixed distribution at both training and test time.
This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information.
It is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime.
arXiv Detail & Related papers (2021-09-03T14:29:20Z) - Estimating Redundancy in Clinical Text [6.245180523143739]
Clinicians populate new documents by duplicating existing notes, then updating them accordingly.
Quantifying information redundancy can play an essential role in evaluating innovations that operate on clinical narratives.
We present and evaluate two strategies to measure redundancy: an information-theoretic approach and a lexicosyntactic and semantic model.
arXiv Detail & Related papers (2021-05-25T11:01:45Z) - The unreasonable effectiveness of Batch-Norm statistics in addressing
catastrophic forgetting across medical institutions [8.244654685687054]
We investigate the trade-off between model refinement and retention of previously learned knowledge.
We propose a simple yet effective approach, adapting Elastic Weight Consolidation (EWC) using the global batch normalization statistics of the original dataset.
arXiv Detail & Related papers (2020-11-16T16:57:05Z) - Accurate and Robust Feature Importance Estimation under Distribution
Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.