Medical Unlearnable Examples: Securing Medical Data from Unauthorized Training via Sparsity-Aware Local Masking
- URL: http://arxiv.org/abs/2403.10573v2
- Date: Sun, 7 Jul 2024 13:36:22 GMT
- Title: Medical Unlearnable Examples: Securing Medical Data from Unauthorized Training via Sparsity-Aware Local Masking
- Authors: Weixiang Sun, Yixin Liu, Zhiling Yan, Kaidi Xu, Lichao Sun
- Abstract summary: Fears of unauthorized use, like training commercial AI models, hinder researchers from sharing their valuable datasets.
We propose the Sparsity-Aware Local Masking (SALM) method, which selectively perturbs significant pixel regions rather than the entire image.
Our experiments demonstrate that SALM effectively prevents unauthorized training of different models and outperforms previous SoTA data protection methods.
- Score: 24.850260039814774
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The rapid expansion of AI in healthcare has led to a surge in medical data generation and storage, boosting medical AI development. However, fears of unauthorized use, such as training commercial AI models, hinder researchers from sharing their valuable datasets. To encourage data sharing, one promising solution is to introduce imperceptible noise into the data. This approach aims to safeguard the data against unauthorized training by degrading the generalization ability of any model trained on it. However, existing methods of this kind are neither effective nor efficient when applied to medical data, mainly because they ignore the sparse nature of medical images. To address this problem, we propose the Sparsity-Aware Local Masking (SALM) method, a novel approach that selectively perturbs significant pixel regions rather than the entire image, as previous methods do. By focusing on local areas, this simple yet effective approach significantly narrows the search space for perturbations and fully leverages the sparsity of medical images. Our extensive experiments across various datasets and model architectures demonstrate that SALM effectively prevents unauthorized training of different models and outperforms previous SoTA data protection methods.
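The core idea in the abstract, perturbing only significant pixel regions instead of the whole image, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the `saliency` input (a hypothetical per-pixel importance score, for example a gradient magnitude), the `keep_ratio` parameter, and the `epsilon` bound are all assumptions made for illustration.

```python
import numpy as np


def sparse_local_mask(saliency: np.ndarray, keep_ratio: float) -> np.ndarray:
    """Return a 0/1 mask selecting the top `keep_ratio` fraction of pixels.

    `saliency` is an assumed per-pixel importance score; how it is computed
    is left open here.
    """
    flat = np.abs(saliency).ravel()
    k = max(1, int(round(keep_ratio * flat.size)))
    # argpartition places the k largest scores in the last k positions.
    top_idx = np.argpartition(flat, flat.size - k)[flat.size - k:]
    mask = np.zeros(flat.size, dtype=np.float64)
    mask[top_idx] = 1.0
    return mask.reshape(saliency.shape)


def apply_masked_perturbation(image, noise, saliency,
                              keep_ratio=0.1, epsilon=8 / 255):
    """Add bounded noise only where the mask is 1; other pixels are untouched.

    `image` is assumed to hold intensities in [0, 1].
    """
    mask = sparse_local_mask(saliency, keep_ratio)
    # Bound the noise magnitude, then zero it outside the selected pixels.
    delta = np.clip(noise, -epsilon, epsilon) * mask
    protected = np.clip(image + delta, 0.0, 1.0)
    return protected, mask
```

With `keep_ratio=0.25` on an 8x8 image, only 16 pixels are perturbed, so any search over perturbation values covers a quarter of the image rather than all of it.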
Related papers
- FedDP: Privacy-preserving method based on federated learning for histopathology image segmentation [2.864354559973703]
This paper addresses the dispersed nature and privacy sensitivity of medical image data by employing a federated learning framework.
The proposed method, FedDP, minimally impacts model accuracy while effectively safeguarding the privacy of cancer pathology image data.
arXiv Detail & Related papers (2024-11-07T08:02:58Z)
- Improving the Classification Effect of Clinical Images of Diseases for Multi-Source Privacy Protection [0.0]
Privacy data protection in the medical field poses challenges to data sharing.
Traditional centralized training methods are difficult to apply due to violations of privacy protection principles.
We propose a medical privacy data training framework based on data vectors.
arXiv Detail & Related papers (2024-08-23T12:52:24Z)
- SMART: Towards Pre-trained Missing-Aware Model for Patient Health Status Prediction [15.136747790595217]
We propose a Self-Supervised Missing-Aware RepresenTation Learning approach for patient health status prediction.
By adopting missing-aware attentions and focusing on learning higher-order representations, SMART promotes better generalization and robustness to missing data.
We validate the effectiveness of SMART through extensive experiments on six EHR tasks, demonstrating its superiority over state-of-the-art methods.
arXiv Detail & Related papers (2024-05-15T02:19:34Z)
- MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
- ArSDM: Colonoscopy Images Synthesis with Adaptive Refinement Semantic Diffusion Models [69.9178140563928]
Colonoscopy analysis is essential for assisting clinical diagnosis and treatment.
The scarcity of annotated data limits the effectiveness and generalization of existing methods.
We propose an Adaptive Refinement Semantic Diffusion Model (ArSDM) to generate colonoscopy images that benefit the downstream tasks.
arXiv Detail & Related papers (2023-09-03T07:55:46Z)
- Privacy-Preserving Medical Image Classification through Deep Learning and Matrix Decomposition [0.0]
Deep learning (DL) solutions have been extensively researched in the medical domain in recent years.
Because the usage of health-related data is strictly regulated, processing medical records outside the hospital environment demands robust data protection measures.
In this paper, we use singular value decomposition (SVD) and principal component analysis (PCA) to obfuscate the medical images before employing them in the DL analysis.
The capability of DL algorithms to extract relevant information from secured data is assessed on a task of angiographic view classification based on obfuscated frames.
arXiv Detail & Related papers (2023-08-31T08:21:09Z)
- Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
- Decentralized Distributed Learning with Privacy-Preserving Data Synthesis [9.276097219140073]
In the medical field, multi-center collaborations are often sought to yield more generalizable findings by leveraging the heterogeneity of patient and clinical data.
Recent privacy regulations hinder the possibility to share data, and consequently, to come up with machine learning-based solutions that support diagnosis and prognosis.
We present a decentralized distributed method that integrates features from local nodes, providing models able to generalize across multiple datasets while maintaining privacy.
arXiv Detail & Related papers (2022-06-20T23:49:38Z)
- When Accuracy Meets Privacy: Two-Stage Federated Transfer Learning Framework in Classification of Medical Images on Limited Data: A COVID-19 Case Study [77.34726150561087]
The COVID-19 pandemic has spread rapidly and caused a shortage of global medical resources.
CNNs have been widely used and verified in analyzing medical images.
arXiv Detail & Related papers (2022-03-24T02:09:41Z)
- Practical Challenges in Differentially-Private Federated Survival Analysis of Medical Data [57.19441629270029]
In this paper, we take advantage of the inherent properties of neural networks to federate the process of training of survival analysis models.
In the realistic setting of small medical datasets and only a few data centers, the noise added for differential privacy makes it harder for the models to converge.
We propose DPFed-post which adds a post-processing stage to the private federated learning scheme.
arXiv Detail & Related papers (2022-02-08T10:03:24Z)
- FLOP: Federated Learning on Medical Datasets using Partial Networks [84.54663831520853]
COVID-19, the disease caused by the novel coronavirus, has led to a shortage of medical resources.
Different data-driven deep learning models have been developed to assist the diagnosis of COVID-19.
The data itself is still scarce due to patient privacy concerns.
We propose a simple yet effective algorithm, named Federated Learning on Medical Datasets using Partial Networks (FLOP).
arXiv Detail & Related papers (2021-02-10T01:56:58Z)