Addressing catastrophic forgetting for medical domain expansion
- URL: http://arxiv.org/abs/2103.13511v1
- Date: Wed, 24 Mar 2021 22:33:38 GMT
- Title: Addressing catastrophic forgetting for medical domain expansion
- Authors: Sharut Gupta, Praveer Singh, Ken Chang, Liangqiong Qu, Mehak Aggarwal,
Nishanth Arun, Ashwin Vaswani, Shruti Raghavan, Vibha Agarwal, Mishka
Gidwani, Katharina Hoebel, Jay Patel, Charles Lu, Christopher P. Bridge,
Daniel L. Rubin, Jayashree Kalpathy-Cramer
- Abstract summary: Model brittleness is a key concern when deploying deep learning models in real-world medical settings.
A model that has high performance at one institution may suffer a significant decline in performance when tested at other institutions.
We develop an approach to address catastrophic forgetting based on elastic weight consolidation combined with modulation of batch normalization statistics.
- Score: 9.720534481714953
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model brittleness is a key concern when deploying deep learning models in
real-world medical settings. A model that has high performance at one
institution may suffer a significant decline in performance when tested at
other institutions. While pooling datasets from multiple institutions and
retraining may provide a straightforward solution, it is often infeasible and
may compromise patient privacy. An alternative approach is to fine-tune the
model on subsequent institutions after training on the original institution.
Notably, this approach degrades model performance at the original institution,
a phenomenon known as catastrophic forgetting. In this paper, we develop an
approach to address catastrophic forgetting based on elastic weight
consolidation combined with modulation of batch normalization statistics under
two scenarios: first, for expanding the domain from one imaging system's data
to another imaging system's, and second, for expanding the domain from a large
multi-institutional dataset to another single institution dataset. We show that
our approach outperforms several other state-of-the-art approaches and provide
theoretical justification for the efficacy of batch normalization modulation.
The results of this study are generally applicable to the deployment of any
clinical deep learning model which requires domain expansion.
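The listing carries no code, but the two ingredients the abstract names are easy to outline. Below is a minimal PyTorch sketch, using a standard diagonal-Fisher approximation of EWC and assumed function names, of (a) penalizing drift from the original-institution weights and (b) fine-tuning with the original institution's batch normalization statistics held fixed. It illustrates the general technique, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

def fisher_diagonal(model, loss_fn, loader):
    """Diagonal Fisher estimate: mean squared gradients on the original data."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()
              if p.requires_grad}
    model.eval()
    for x, y in loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
    return {n: f / max(len(loader), 1) for n, f in fisher.items()}

def ewc_penalty(model, old_params, fisher):
    """Quadratic pull toward the original-institution optimum, weighted by Fisher."""
    return sum((fisher[n] * (p - old_params[n]) ** 2).sum()
               for n, p in model.named_parameters() if n in fisher)

def freeze_bn_stats(model):
    """Run BatchNorm layers in eval mode so fine-tuning on the new institution
    uses, and does not overwrite, the original running statistics."""
    for m in model.modules():
        if isinstance(m, nn.modules.batchnorm._BatchNorm):
            m.eval()

# Before fine-tuning:
#   old_params = {n: p.detach().clone() for n, p in model.named_parameters()}
#   fisher = fisher_diagonal(model, loss_fn, original_loader)
def finetune_step(model, batch, loss_fn, optimizer, old_params, fisher, lam=100.0):
    x, y = batch
    freeze_bn_stats(model)  # re-apply each step in case model.train() reset the modes
    loss = loss_fn(model(x), y) + lam * ewc_penalty(model, old_params, fisher)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```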
Related papers
- LoRKD: Low-Rank Knowledge Decomposition for Medical Foundation Models [59.961172635689664]
"Knowledge Decomposition" aims to improve the performance on specific medical tasks.
We propose a novel framework named Low-Rank Knowledge Decomposition (LoRKD).
LoRKD explicitly separates gradients from different tasks by incorporating low-rank expert modules and an efficient knowledge-separation convolution.
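The abstract does not detail the expert modules; purely as a hedged illustration, a low-rank expert can be sketched as a LoRA-style rank-r adapter on a frozen shared layer. The paper's knowledge-separation convolution is not reproduced, and all names and shapes below are assumptions.

```python
import torch
import torch.nn as nn

class LowRankExpert(nn.Module):
    """Rank-r residual adapter: frozen shared weight plus a task-specific B @ A."""
    def __init__(self, shared: nn.Linear, rank: int = 8):
        super().__init__()
        self.shared = shared
        for p in self.shared.parameters():
            p.requires_grad = False                          # shared knowledge stays fixed
        self.A = nn.Parameter(torch.randn(rank, shared.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(shared.out_features, rank))  # zero-init: no-op at start

    def forward(self, x):
        return self.shared(x) + x @ self.A.T @ self.B.T      # low-rank task-specific update

# One expert per task keeps each task's gradients in its own low-rank subspace.
shared = nn.Linear(256, 256)
experts = nn.ModuleDict({t: LowRankExpert(shared) for t in ["xray", "ct", "mri"]})
out = experts["ct"](torch.randn(4, 256))
```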
arXiv Detail & Related papers (2024-09-29T03:56:21Z)
- Rethinking Model Prototyping through the MedMNIST+ Dataset Collection [0.11999555634662634]
This work presents a benchmark for the MedMNIST+ database to diversify the evaluation landscape.
We conduct a thorough analysis of common convolutional neural networks (CNNs) and Transformer-based architectures for medical image classification.
Our findings suggest that computationally efficient training schemes and modern foundation models hold promise in bridging the gap between expensive end-to-end training and more resource-efficient approaches.
arXiv Detail & Related papers (2024-04-24T10:19:25Z)
- Incremental Learning for Heterogeneous Structure Segmentation in Brain Tumor MRI [11.314017805825685]
We propose a divergence-aware dual-flow module with balanced rigidity and plasticity branches to decouple old and new tasks.
We evaluate our framework on a brain tumor segmentation task with continually changing target domains.
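The module itself is not specified here; one plausible, purely illustrative reading of "balanced rigidity and plasticity branches" is a frozen copy of the old head alongside a trainable copy, fused by a learned gate. All details below are assumptions.

```python
import copy
import torch
import torch.nn as nn

class DualFlowHead(nn.Module):
    """Frozen 'rigidity' branch preserves old-structure behavior; a trainable
    'plasticity' branch, initialized from the same weights, adapts to new ones."""
    def __init__(self, old_head: nn.Module):
        super().__init__()
        self.rigid = copy.deepcopy(old_head)
        for p in self.rigid.parameters():
            p.requires_grad = False
        self.plastic = copy.deepcopy(old_head)
        self.gate = nn.Parameter(torch.tensor(0.5))          # learned rigidity/plasticity balance

    def forward(self, feats):
        g = torch.sigmoid(self.gate)
        return g * self.plastic(feats) + (1 - g) * self.rigid(feats)

head = DualFlowHead(nn.Conv2d(64, 4, kernel_size=1))         # 4 = old + new structure classes
logits = head(torch.randn(2, 64, 32, 32))
```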
arXiv Detail & Related papers (2023-05-30T20:39:03Z)
- Distillation to Enhance the Portability of Risk Models Across Institutions with Large Patient Claims Database [12.452703677540505]
We investigate the practicality of model portability through a cross-site evaluation of readmission prediction models.
We apply a recurrent neural network, augmented with self-attention and blended with expert features, to build readmission prediction models.
Our experiments show that ML models trained at one institution and applied directly at another perform worse than models trained and tested at the same institution.
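As a hedged sketch of the kind of architecture described (names, sizes, and the attention form are assumptions, not the paper's exact model):

```python
import torch
import torch.nn as nn

class ReadmissionModel(nn.Module):
    """GRU over claim-code sequences, additive self-attention pooling,
    concatenated with hand-crafted expert features, then a logistic head."""
    def __init__(self, vocab=2000, emb=64, hidden=128, n_expert=16):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb, padding_idx=0)
        self.gru = nn.GRU(emb, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)                     # attention score per time step
        self.head = nn.Linear(hidden + n_expert, 1)

    def forward(self, codes, expert_feats):
        h, _ = self.gru(self.embed(codes))                   # (B, T, hidden)
        w = torch.softmax(self.attn(h), dim=1)               # weights over the T visits
        pooled = (w * h).sum(dim=1)                          # (B, hidden)
        return self.head(torch.cat([pooled, expert_feats], dim=-1))  # readmission logit

model = ReadmissionModel()
logit = model(torch.randint(0, 2000, (8, 30)), torch.randn(8, 16))
```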
arXiv Detail & Related papers (2022-07-06T05:26:32Z)
- DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
Federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data.
We propose a general framework that addresses two challenges, distribution shift and inter-client noise, simultaneously.
We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
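The "local mixup" ingredient follows standard mixup (convex combinations of inputs and soft labels), applied inside each client's local update; a minimal sketch, with the federated plumbing assumed:

```python
import torch

def local_mixup(x, y, alpha=0.4):
    """Mixup inside one client's local update: convex combinations of examples
    and their soft labels temper client-local noise before local SGD."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    idx = torch.randperm(x.size(0))
    return lam * x + (1 - lam) * x[idx], lam * y + (1 - lam) * y[idx]

# e.g., with one-hot labels, inside a client's local epoch:
x, y = local_mixup(torch.randn(16, 3, 32, 32), torch.eye(10)[torch.randint(0, 10, (16,))])
```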
arXiv Detail & Related papers (2022-04-16T08:08:29Z)
- Practical Challenges in Differentially-Private Federated Survival Analysis of Medical Data [57.19441629270029]
In this paper, we take advantage of the inherent properties of neural networks to federate the training of survival analysis models.
In the realistic setting of small medical datasets and only a few data centers, the noise added for differential privacy makes it harder for the models to converge.
We propose DPFed-post which adds a post-processing stage to the private federated learning scheme.
arXiv Detail & Related papers (2022-02-08T10:03:24Z)
- Source-Free Open Compound Domain Adaptation in Semantic Segmentation [99.82890571842603]
In SF-OCDA, only the source pre-trained model and the target data are available to learn the target model.
We propose the Cross-Patch Style Swap (CPSS) to diversify samples with various patch styles at the feature level.
Our method produces state-of-the-art results on the C-Driving dataset.
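CPSS details are not given here; one common way to realize feature-level style swapping, shown purely as an assumption-laden sketch, is to treat per-patch channel statistics as "style" and exchange them AdaIN-style between random patches:

```python
import torch

def cross_patch_style_swap(feats, patch=8, eps=1e-5):
    """Swap per-patch channel-wise mean/std between random patch pairs
    to diversify the styles seen in a feature map."""
    b, c, h, w = feats.shape
    # split the feature map into non-overlapping patches: (B, N, C, patch, patch)
    p = feats.unfold(2, patch, patch).unfold(3, patch, patch)  # (B,C,H/p,W/p,p,p)
    p = p.reshape(b, c, -1, patch, patch).permute(0, 2, 1, 3, 4)
    mu = p.mean(dim=(-2, -1), keepdim=True)
    sd = p.std(dim=(-2, -1), keepdim=True) + eps
    perm = torch.randperm(p.size(1))                           # random patch pairing
    p = (p - mu) / sd * sd[:, perm] + mu[:, perm]              # AdaIN-style restyle
    # reassemble the patches back into (B, C, H, W)
    p = p.permute(0, 2, 1, 3, 4).reshape(b, c, h // patch, w // patch, patch, patch)
    return p.permute(0, 1, 2, 4, 3, 5).reshape(b, c, h, w)

augmented = cross_patch_style_swap(torch.randn(2, 64, 32, 32))
```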
arXiv Detail & Related papers (2021-06-07T08:38:41Z)
- A Twin Neural Model for Uplift [59.38563723706796]
Uplift is a particular case of conditional treatment effect modeling.
We propose a new loss function defined by leveraging a connection with the Bayesian interpretation of the relative risk.
We show our proposed method is competitive with the state of the art in simulated settings and on real data from large-scale randomized experiments.
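A hedged sketch of a generic twin (shared-weight) uplift model follows; the paper's relative-risk-based loss is not reproduced, and the factual-arm training noted in the comments is a stand-in:

```python
import torch
import torch.nn as nn

class TwinUplift(nn.Module):
    """One shared network scores the same features under t=1 and t=0;
    uplift is the difference of the two predicted outcome probabilities."""
    def __init__(self, n_features):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_features + 1, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x, t):
        return self.net(torch.cat([x, t], dim=-1))           # logit of P(outcome | x, t)

    def uplift(self, x):
        ones = torch.ones(x.size(0), 1)
        p1 = torch.sigmoid(self(x, ones))                    # predicted outcome if treated
        p0 = torch.sigmoid(self(x, torch.zeros_like(ones)))  # ... if untreated
        return p1 - p0

model = TwinUplift(n_features=10)
tau = model.uplift(torch.randn(32, 10))
# Training fits the factual arm of each example, e.g. with BCE; the paper's
# relative-risk-based loss would replace that criterion.
```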
arXiv Detail & Related papers (2021-05-11T16:02:39Z)
- Adversarial Sample Enhanced Domain Adaptation: A Case Study on Predictive Modeling with Electronic Health Records [57.75125067744978]
We propose a data augmentation method to facilitate domain adaptation.
Adversarially generated samples are used during domain adaptation.
Results confirm the effectiveness of our method and its generality across different tasks.
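The abstract does not name the attack; as an assumed, illustrative choice, FGSM-style perturbations of target-domain features could supply the adversarial samples:

```python
import torch

def fgsm_samples(model, loss_fn, x, y, eps=0.01):
    """One-step FGSM perturbation of EHR feature vectors; the adversarial
    batch augments target-domain data during adaptation."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss_fn(model(x_adv), y).backward()
    model.zero_grad()                       # discard gradients on the model itself
    return (x_adv + eps * x_adv.grad.sign()).detach()

# During adaptation, train on real and adversarial batches together:
# loss = loss_fn(model(x), y) + loss_fn(model(fgsm_samples(model, loss_fn, x, y)), y)
```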
arXiv Detail & Related papers (2021-01-13T03:20:20Z)
- The unreasonable effectiveness of Batch-Norm statistics in addressing catastrophic forgetting across medical institutions [8.244654685687054]
We investigate the trade-off between model refinement and retention of previously learned knowledge.
We propose a simple yet effective approach, adapting elastic weight consolidation (EWC) using the global batch normalization statistics of the original dataset.
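Complementing the EWC sketch after the main abstract, the key move here is snapshotting the original dataset's global BN statistics and reinstating them after fine-tuning; a minimal sketch (function names are assumptions):

```python
import torch.nn as nn

def capture_bn_stats(model):
    """Snapshot the global BN running statistics learned on the original dataset."""
    return {n: (m.running_mean.clone(), m.running_var.clone())
            for n, m in model.named_modules()
            if isinstance(m, nn.modules.batchnorm._BatchNorm)}

def restore_bn_stats(model, stats):
    """After fine-tuning elsewhere, reinstate the original statistics so the
    original institution is normalized with its own constants at inference."""
    for n, m in model.named_modules():
        if n in stats:
            m.running_mean.copy_(stats[n][0])
            m.running_var.copy_(stats[n][1])
```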
arXiv Detail & Related papers (2020-11-16T16:57:05Z)
- Multi-site fMRI Analysis Using Privacy-preserving Federated Learning and Domain Adaptation: ABIDE Results [13.615292855384729]
To train a high-quality deep learning model, the aggregation of a significant amount of patient information is required.
Due to the need to protect the privacy of patient data, it is hard to assemble a central database from multiple institutions.
Federated learning allows population-level models to be trained without centralizing each entity's data.
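The summary alludes to standard federated aggregation; a minimal FedAvg-style sketch (McMahan et al.'s weighted parameter averaging), with all orchestration assumed:

```python
import copy
import torch

def fedavg(global_model, client_models, client_sizes):
    """One FedAvg round: average client weights, weighted by local dataset size,
    so no site ever shares raw patient data."""
    total = float(sum(client_sizes))
    new_global = copy.deepcopy(global_model)
    with torch.no_grad():
        for name, param in new_global.state_dict().items():
            weighted = sum(m.state_dict()[name].float() * (s / total)
                           for m, s in zip(client_models, client_sizes))
            param.copy_(weighted)
    return new_global
```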
arXiv Detail & Related papers (2020-01-16T04:49:33Z)