Bringing the Algorithms to the Data -- Secure Distributed Medical
Analytics using the Personal Health Train (PHT-meDIC)
- URL: http://arxiv.org/abs/2212.03481v1
- Date: Wed, 7 Dec 2022 06:29:15 GMT
- Title: Bringing the Algorithms to the Data -- Secure Distributed Medical
Analytics using the Personal Health Train (PHT-meDIC)
- Authors: Marius de Arruda Botelho Herr, Michael Graf, Peter Placzek, Florian
König, Felix Bötte, Tyra Stickel, David Hieber, Lukas Zimmermann, Michael
Slupina, Christopher Mohr, Stephanie Biergans, Mete Akgün, Nico Pfeifer,
Oliver Kohlbacher
- Abstract summary: The Personal Health Train (PHT) paradigm implements an 'algorithm to the data' approach.
We present PHT-meDIC, a productively deployed open-source implementation of the PHT concept.
- Score: 1.451998131020241
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The need for data privacy and security -- enforced through increasingly
strict data protection regulations -- renders the use of healthcare data for
machine learning difficult. In particular, the transfer of data between
different hospitals is often not permissible, and thus cross-site pooling of
data is not an option. The Personal Health Train (PHT) paradigm proposed within
the GO-FAIR initiative implements an 'algorithm to the data' paradigm that
ensures that distributed data can be accessed for analysis without transferring
any sensitive data. We present PHT-meDIC, a productively deployed open-source
implementation of the PHT concept. Containerization allows us to easily deploy
even complex data analysis pipelines (e.g., genomics, image analysis) across
multiple sites in a secure and scalable manner. We discuss the underlying
technological concepts, security models, and governance processes. The
implementation has been successfully applied to distributed analyses of
large-scale data, including applications of deep neural networks to medical
image data.
Related papers
- Privacy-Preserving SAM Quantization for Efficient Edge Intelligence in Healthcare [9.381558154295012]
Segment Anything Model (SAM) excels in intelligent image segmentation.
SAM poses significant challenges for deployment on resource-limited edge devices.
We propose a data-free quantization framework for SAM, called DFQ-SAM, which learns and calibrates quantization parameters without any original data.
arXiv Detail & Related papers (2024-09-14T10:43:35Z)
- An advanced data fabric architecture leveraging homomorphic encryption and federated learning [10.779491433438144]
This paper introduces a secure approach for medical image analysis using federated learning and partially homomorphic encryption within a distributed data fabric architecture.
The study demonstrates the method's effectiveness through a case study on pituitary tumor classification, achieving a significant level of accuracy.
arXiv Detail & Related papers (2024-02-15T08:50:36Z)
- Building Flexible, Scalable, and Machine Learning-ready Multimodal Oncology Datasets [17.774341783844026]
This work proposes the Multimodal Integration of Oncology Data System (MINDS).
MINDS is a flexible, scalable, and cost-effective metadata framework for efficiently fusing disparate data from public sources.
By harmonizing multimodal data, MINDS aims to potentially empower researchers with greater analytical ability.
arXiv Detail & Related papers (2023-09-30T15:44:39Z)
- A Unified View of Differentially Private Deep Generative Modeling [60.72161965018005]
Data with privacy concerns comes with stringent regulations that frequently prohibit data access and data sharing.
Overcoming these obstacles is key for technological progress in many real-world application scenarios that involve privacy sensitive data.
Differentially private (DP) data publishing provides a compelling solution, where only a sanitized form of the data is publicly released.
arXiv Detail & Related papers (2023-09-27T14:38:16Z)
- Privacy-Preserving Medical Image Classification through Deep Learning and Matrix Decomposition [0.0]
Deep learning (DL) solutions have been extensively researched in the medical domain in recent years.
The usage of health-related data is strictly regulated, and processing medical records outside the hospital environment demands robust data protection measures.
In this paper, we use singular value decomposition (SVD) and principal component analysis (PCA) to obfuscate the medical images before employing them in the DL analysis.
The capability of DL algorithms to extract relevant information from secured data is assessed on a task of angiographic view classification based on obfuscated frames.
arXiv Detail & Related papers (2023-08-31T08:21:09Z)
- Blockchain-empowered Federated Learning for Healthcare Metaverses: User-centric Incentive Mechanism with Optimal Data Freshness [66.3982155172418]
We first design a user-centric privacy-preserving framework based on decentralized Federated Learning (FL) for healthcare metaverses.
We then utilize Age of Information (AoI) as an effective data-freshness metric and propose an AoI-based contract theory model under Prospect Theory (PT) to motivate sensing data sharing.
arXiv Detail & Related papers (2023-07-29T12:54:03Z)
- Distributed sequential federated learning [0.0]
We develop a data-driven method for efficiently and effectively aggregating valued information by analyzing local data.
We use numerical studies of simulated data and an application to COVID-19 data collected from 32 hospitals in Mexico.
arXiv Detail & Related papers (2023-01-31T21:20:45Z)
- Sensitivity analysis in differentially private machine learning using hybrid automatic differentiation [54.88777449903538]
We introduce a novel hybrid automatic differentiation (AD) system for sensitivity analysis.
This enables modelling the sensitivity of arbitrary differentiable function compositions, such as the training of neural networks on private data.
Our approach can enable the principled reasoning about privacy loss in the setting of data processing.
arXiv Detail & Related papers (2021-07-09T07:19:23Z)
- FLOP: Federated Learning on Medical Datasets using Partial Networks [84.54663831520853]
COVID-19 Disease due to the novel coronavirus has caused a shortage of medical resources.
Different data-driven deep learning models have been developed to support the diagnosis of COVID-19.
The data itself is still scarce due to patient privacy concerns.
We propose a simple yet effective algorithm, named Federated Learning on Medical Datasets using Partial Networks (FLOP).
arXiv Detail & Related papers (2021-02-10T01:56:58Z)
- Privacy-preserving medical image analysis [53.4844489668116]
We present PriMIA, a software framework designed for privacy-preserving machine learning (PPML) in medical imaging.
We show significantly better classification performance of a securely aggregated federated learning model compared to human experts on unseen datasets.
We empirically evaluate the framework's security against a gradient-based model inversion attack.
arXiv Detail & Related papers (2020-12-10T13:56:00Z)
- GS-WGAN: A Gradient-Sanitized Approach for Learning Differentially Private Generators [74.16405337436213]
We propose Gradient-sanitized Wasserstein Generative Adversarial Networks (GS-WGAN).
GS-WGAN allows releasing a sanitized form of sensitive data with rigorous privacy guarantees.
We find our approach consistently outperforms state-of-the-art approaches across multiple metrics.
arXiv Detail & Related papers (2020-06-15T10:01:01Z)