Related papers: Multicenter Privacy-Preserving Model Training for Deep Learning Brain Metastases Autosegmentation

URL: http://arxiv.org/abs/2405.10870v2
Date: Thu, 25 Jul 2024 09:51:06 GMT
Title: Multicenter Privacy-Preserving Model Training for Deep Learning Brain Metastases Autosegmentation
Authors: Yixing Huang, Zahra Khodabakhshi, Ahmed Gomaa, Manuel Schmidt, Rainer Fietkau, Matthias Guckenberger, Nicolaus Andratschke, Christoph Bert, Stephanie Tanadini-Lang, Florian Putz
Abstract summary: This work explores the impact of multicenter data heterogeneity on deep learning brain metastases (BM) autosegmentation performance. It also assesses an incremental transfer learning technique, namely learning without forgetting (LWF), for improving model generalizability without sharing raw data. When the UKER pretrained model is applied to USZ, LWF achieves a higher average F1 score (0.839) than naive TL (0.570) and single-center training (0.688) on combined UKER and USZ test data.
Score: 2.479757014250359
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Objectives: This work aims to explore the impact of multicenter data heterogeneity on deep learning brain metastases (BM) autosegmentation performance, and to assess the efficacy of an incremental transfer learning technique, namely learning without forgetting (LWF), in improving model generalizability without sharing raw data. Materials and methods: A total of six BM datasets from University Hospital Erlangen (UKER), University Hospital Zurich (USZ), Stanford, UCSF, NYU and the BraTS Challenge 2023 on BM segmentation were used for this evaluation. First, the multicenter performance of a convolutional neural network (DeepMedic) for BM autosegmentation was established for exclusive single-center training and for training on pooled data, respectively. Subsequently, bilateral collaboration was evaluated, where a UKER pretrained model is shared with another center for further training using transfer learning (TL) either with or without LWF. Results: For single-center training, average F1 scores of BM detection range from 0.625 (NYU) to 0.876 (UKER) on respective single-center test data. Mixed multicenter training notably improves F1 scores at Stanford and NYU, with negligible improvement at other centers. When the UKER pretrained model is applied to USZ, LWF achieves a higher average F1 score (0.839) than naive TL (0.570) and single-center training (0.688) on combined UKER and USZ test data. Naive TL improves sensitivity and contouring accuracy, but compromises precision. Conversely, LWF demonstrates commendable sensitivity, precision and contouring accuracy. When applied to Stanford, similar performance was observed. Conclusion: Data heterogeneity results in varying performance in BM autosegmentation, posing challenges to model generalizability. LWF is a promising approach to peer-to-peer privacy-preserving model training.
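
The LWF idea mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the generic form of a learning-without-forgetting objective: a supervised loss on the new center's label plus a distillation term that keeps the updated model's outputs close to those of the frozen pretrained model. The function names, the `lambda_old` weight and the `temperature` value are hypothetical, and this is not the loss used by the paper's DeepMedic pipeline.

```python
import math

def softmax(logits, temperature=1.0):
    # Numerically stable softmax with an optional distillation temperature.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    # Cross-entropy between a target distribution and a predicted one.
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, predicted_probs))

def lwf_loss(new_logits, old_logits, label, lambda_old=1.0, temperature=2.0):
    # Supervised term: fit the label from the new center's data.
    one_hot = [1.0 if i == label else 0.0 for i in range(len(new_logits))]
    task_term = cross_entropy(one_hot, softmax(new_logits))
    # Distillation term: stay close to the frozen pretrained model's
    # softened outputs on the same input (assumed weighting: lambda_old).
    distill_term = cross_entropy(
        softmax(old_logits, temperature), softmax(new_logits, temperature)
    )
    return task_term + lambda_old * distill_term
```

With `lambda_old=0` the sketch reduces to plain fine-tuning, which corresponds to what the abstract calls naive TL; larger values penalize drifting away from the pretrained model's predictions.
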

Related papers

A Federated and Parameter-Efficient Framework for Large Language Model Training in Medicine [59.78991974851707]
Large language models (LLMs) have demonstrated strong performance on medical benchmarks, including question answering and diagnosis. Most medical LLMs are trained on data from a single institution, which faces limitations in generalizability and safety in heterogeneous systems. We introduce the model-agnostic and parameter-efficient federated learning framework for adapting LLMs to medical applications.
arXiv Detail & Related papers (2026-01-29T18:48:21Z)
FUGC: Benchmarking Semi-Supervised Learning Methods for Cervical Segmentation [63.7829089874007]
This paper introduces the Fetal Ultrasound Grand Challenge (FUGC), the first benchmark for semi-supervised learning in cervical segmentation. FUGC provides a dataset of 890 TVS images, including 500 training images, 90 validation images, and 300 test images. Methods were evaluated using the Dice Similarity Coefficient (DSC), Hausdorff Distance (HD), and runtime (RT), with a weighted combination of 0.4/0.4/0.2.
arXiv Detail & Related papers (2026-01-22T01:34:39Z)
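
As an illustration of the Dice Similarity Coefficient named in the entry above, here is a minimal Python sketch on flattened binary masks. The function name and the convention of returning 1.0 when both masks are empty are assumptions, not details taken from the FUGC evaluation code.

```python
def dice_coefficient(pred_mask, true_mask):
    # Both masks are flat sequences of 0/1 values of equal length.
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    total = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total
```
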
Explainable Admission-Level Predictive Modeling for Prolonged Hospital Stay in Elderly Populations: Challenges in Low- and Middle-Income Countries [65.4286079244589]
Prolonged length of stay (pLoS) is a significant factor associated with the risk of adverse in-hospital events. We develop and explain a predictive model for pLoS using admission-level patient and hospital administrative data.
arXiv Detail & Related papers (2026-01-07T23:35:24Z)
Studying Various Activation Functions and Non-IID Data for Machine Learning Model Robustness [3.641683644638084]
We study machine learning (ML) model robustness using ten different activation functions through adversarial training. Our proposed centralized adversarial training approach achieves a natural and robust accuracy of 77.08% and 67.96%, respectively. In the federated learning environment, however, the robust accuracy decreases significantly, especially on non-IID data.
arXiv Detail & Related papers (2025-12-03T21:03:45Z)
Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning [77.120955854093]
We show that data diversity can be a strong predictor of generalization in language models. We introduce G-Vendi, a metric that quantifies diversity via the entropy of model-induced gradients. We present Prismatic Synthesis, a framework for generating diverse synthetic data.
arXiv Detail & Related papers (2025-05-26T16:05:10Z)
Predicting Length of Stay in Neurological ICU Patients Using Classical Machine Learning and Neural Network Models: A Benchmark Study on MIMIC-IV [49.1574468325115]
This study explores multiple ML approaches for predicting LOS in the ICU, specifically for patients with neurological diseases, based on the MIMIC-IV dataset. The evaluated models include classic ML algorithms (K-Nearest Neighbors, Random Forest, XGBoost and CatBoost) and neural networks (LSTM, BERT and Temporal Fusion Transformer).
arXiv Detail & Related papers (2025-05-23T14:06:42Z)
Election of Collaborators via Reinforcement Learning for Federated Brain Tumor Segmentation [0.1969973131266619]
Federated learning (FL) enables collaborative model training across decentralized datasets. We present RL-HSimAgg, a novel reinforcement learning (RL) and similarity-weighted aggregation (simAgg) algorithm. We demonstrate the effectiveness of Epsilon-greedy (EG) and upper confidence bound (UCB) algorithms for federated brain lesion segmentation.
arXiv Detail & Related papers (2024-12-28T19:54:06Z)
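
The two selection rules named in the entry above can be sketched in Python as textbook bandit policies over a set of candidate collaborators. This is a generic illustration, not the RL-HSimAgg algorithm; the function names and the use of running mean rewards per collaborator are assumptions.

```python
import math
import random

def epsilon_greedy(mean_rewards, epsilon, rng):
    # With probability epsilon explore a random collaborator,
    # otherwise exploit the one with the best mean reward so far.
    if rng.random() < epsilon:
        return rng.randrange(len(mean_rewards))
    return max(range(len(mean_rewards)), key=lambda i: mean_rewards[i])

def ucb1(mean_rewards, counts):
    # Try every collaborator once before using the confidence bound.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    total = sum(counts)
    # UCB1 score: mean reward plus an exploration bonus that shrinks
    # as a collaborator is selected more often.
    scores = [
        m + math.sqrt(2.0 * math.log(total) / n)
        for m, n in zip(mean_rewards, counts)
    ]
    return max(range(len(scores)), key=lambda i: scores[i])
```

For example, `epsilon_greedy(means, 0.1, random.Random(0))` explores in roughly one round out of ten, while `ucb1` needs no randomness and favors rarely selected collaborators until their estimates tighten.
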
Probing Perfection: The Relentless Art of Meddling for Pulmonary Airway Segmentation from HRCT via a Human-AI Collaboration Based Active Learning Method [13.384578466263566]
In pulmonary tracheal segmentation, the scarcity of annotated data is a prevalent issue. Deep Learning (DL) methods face challenges: the opacity of 'black box' models and the need for performance enhancement. We address these challenges by combining diverse query strategies with various DL models.
arXiv Detail & Related papers (2024-07-03T23:27:53Z)
Federated learning model for predicting major postoperative complications [2.565552377354702]
We developed federated learning models to predict nine major postoperative complications. We compared federated learning models with local learning models trained on a single site and central learning models trained on a pooled dataset from two centers. Our federated learning model obtained comparable performance to the best local learning model at each center, demonstrating strong generalizability.
arXiv Detail & Related papers (2024-04-09T22:31:10Z)
How to Train Data-Efficient LLMs [56.41105687693619]
We study data-efficient approaches for pre-training large language models (LLMs). In our comparison of 19 samplers, involving hundreds of evaluation tasks and pre-training runs, we find that Ask-LLM and Density are the best methods in their respective categories.
arXiv Detail & Related papers (2024-02-15T02:27:57Z)
A Federated Learning Framework for Stenosis Detection [70.27581181445329]
This study explores the use of Federated Learning (FL) for stenosis detection in coronary angiography images (CA). Two heterogeneous datasets from two institutions were considered: dataset 1 includes 1219 images from 200 patients, which we acquired at the Ospedale Riuniti of Ancona (Italy); dataset 2 includes 7492 sequential images from 90 patients from a previous study available in the literature.
arXiv Detail & Related papers (2023-10-30T11:13:40Z)
Effect of Intensity Standardization on Deep Learning for WML Segmentation in Multi-Centre FLAIR MRI [0.06117371161379209]
Deep learning (DL) methods for white matter lesion (WML) segmentation in MRI suffer a reduction in performance when applied on data from a scanner or centre that is out-of-distribution (OOD) from the training data. This is critical for translation and widescale adoption, since current models cannot be readily applied to data from new institutions. We evaluate several intensity standardization methods for MRI as a preprocessing step for WML segmentation in multi-centre Fluid-Attenuated Inversion Recovery (FLAIR) MRI.
arXiv Detail & Related papers (2023-07-07T20:51:38Z)
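
To make the notion of intensity standardization concrete, the following Python sketch applies z-score normalization to a list of voxel intensities. Z-scoring is one common standardization choice and is used here only as an assumption; the summary above does not say which methods the paper evaluates.

```python
import statistics

def zscore_standardize(intensities):
    # Shift to zero mean and scale to unit (population) standard deviation.
    mean = statistics.fmean(intensities)
    std = statistics.pstdev(intensities)
    if std == 0:
        # Constant image: return zeros instead of dividing by zero.
        return [0.0 for _ in intensities]
    return [(value - mean) / std for value in intensities]
```
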
Robust Learning with Progressive Data Expansion Against Spurious Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features. Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process. We propose a new training algorithm called PDE that efficiently enhances the model's robustness for a better worst-group performance.
arXiv Detail & Related papers (2023-06-08T05:44:06Z)
Continual Learning for Peer-to-Peer Federated Learning: A Study on Automated Brain Metastasis Identification [8.071094228545297]
Continual learning, as one approach to peer-to-peer federated learning, can promote multicenter collaboration on deep learning algorithm development. Our experiments demonstrate that continual learning can improve brain metastasis identification performance for centers with limited data.
arXiv Detail & Related papers (2022-04-26T20:17:36Z)
Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model. We introduce two unique positive sampling strategies specifically tailored for EHR data. Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
Deep learning-based COVID-19 pneumonia classification using chest CT images: model generalizability [54.86482395312936]
Deep learning (DL) classification models were trained to identify COVID-19-positive patients on 3D computed tomography (CT) datasets from different countries. We trained nine identical DL-based classification models by using combinations of the datasets with a 72% train, 8% validation, and 20% test data split. The models trained on multiple datasets and evaluated on a test set from one of the datasets used for training performed better.
arXiv Detail & Related papers (2021-02-18T21:14:52Z)
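
A 72% / 8% / 20% split like the one described in the entry above can be produced with a seeded shuffle, as in this minimal Python sketch. The function name, the seed, and splitting by item index are assumptions; the paper may split per patient or per dataset in a different way.

```python
import random

def split_indices(n_items, train_frac=0.72, val_frac=0.08, seed=0):
    # Shuffle indices reproducibly, then cut into train / validation / test.
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_train = round(n_items * train_frac)
    n_val = round(n_items * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder, about 20% by default
    return train, val, test
```
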
All Data Inclusive, Deep Learning Models to Predict Critical Events in the Medical Information Mart for Intensive Care III Database (MIMIC III) [0.0]
This study was performed using 42,818 hospital admissions involving 35,348 patients. Over 75 million events across multiple data sources were processed, resulting in over 355 million tokens. It is possible to predict in-hospital mortality with much better confidence and higher reliability from models built using all sources of data.
arXiv Detail & Related papers (2020-09-02T22:12:18Z)
Deep F-measure Maximization for End-to-End Speech Understanding [52.36496114728355]
We propose a differentiable approximation to the F-measure and train the network with this objective using standard backpropagation. We perform experiments on two standard fairness datasets, Adult and Communities and Crime, and also on speech-to-intent detection on the ATIS dataset and speech-to-image concept classification on the Speech-COCO dataset. In all four of these tasks, F-measure maximization results in improved micro-F1 scores, with absolute improvements of up to 8%, as compared to models trained with the cross-entropy loss function.
arXiv Detail & Related papers (2020-08-08T03:02:27Z)
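
One common way to make the F-measure differentiable is to replace hard true-positive, false-positive and false-negative counts with sums of predicted probabilities. The Python sketch below shows that soft-count surrogate for binary labels; it is an illustrative assumption and not necessarily the approximation proposed in the paper above.

```python
def soft_f1(probs, labels, eps=1e-8):
    # probs: predicted probabilities in [0, 1]; labels: 0/1 ground truth.
    tp = sum(p * y for p, y in zip(probs, labels))        # soft true positives
    fp = sum(p * (1 - y) for p, y in zip(probs, labels))  # soft false positives
    fn = sum((1 - p) * y for p, y in zip(probs, labels))  # soft false negatives
    # F1 = 2TP / (2TP + FP + FN); eps avoids division by zero.
    return 2.0 * tp / (2.0 * tp + fp + fn + eps)
```

When every probability is exactly 0 or 1, the soft counts equal the usual counts and the value matches the ordinary F1 score up to `eps`; a training loss would typically be `1 - soft_f1(...)`.
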

This list is automatically generated from the titles and abstracts of the papers in this site.
