Securing Biomedical Images from Unauthorized Training with Anti-Learning Perturbation
- URL: http://arxiv.org/abs/2303.02559v1
- Date: Sun, 5 Mar 2023 03:09:03 GMT
- Title: Securing Biomedical Images from Unauthorized Training with Anti-Learning Perturbation
- Authors: Yixin Liu, Haohui Ye, Kai Zhang, Lichao Sun
- Abstract summary: We propose a novel approach termed "unlearnable biomedical image" for protecting biomedical data by injecting imperceptible but delusive noise into the data.
Our method is an important step toward encouraging more institutions to contribute their data for the long-term development of the research community.
- Score: 26.81914618642174
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The volume of open-source biomedical data has been essential to the
development of various spheres of the healthcare community since more "free"
data can provide individual researchers more chances to contribute. However,
institutions often hesitate to share their data with the public due to the risk
of data exploitation by unauthorized third parties for other commercial uses
(e.g., training AI models). This phenomenon might hinder the development of the
whole healthcare research community. To address this concern, we propose a
novel approach termed "unlearnable biomedical image" for protecting biomedical
data by injecting imperceptible but delusive noise into the data, making them
unexploitable for AI models. We formulate the problem as a bi-level
optimization and propose three kinds of anti-learning perturbation generation
approaches to solve the problem. Our method is an important step toward
encouraging more institutions to contribute their data for the long-term
development of the research community.
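The abstract names a bi-level optimization but does not spell it out here. As a rough illustration only, the sketch below uses the error-minimizing ("min-min") formulation common in the unlearnable-examples literature: a model step lowers the training loss on perturbed data, and a noise step lowers the same loss by adjusting a bounded per-sample perturbation. Whether this matches any of the paper's three generation approaches is an assumption. The function name `error_minimizing_noise`, the logistic-regression stand-in for a deep network, the random toy data, and all hyperparameters are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def logistic_loss(w, b, x, y):
    """Mean binary cross-entropy of a linear classifier on (x, y)."""
    p = np.clip(sigmoid(x @ w + b), 1e-12, 1.0 - 1e-12)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))


def error_minimizing_noise(x, y, eps, outer_steps=20, inner_steps=10,
                           lr_model=0.5, lr_noise=0.01, seed=0):
    """Alternate model updates and bounded noise updates (hypothetical sketch).

    Assumed objective: min over (w, b) and delta of the loss on x + delta,
    subject to |delta| <= eps elementwise (an L-infinity budget).
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w = rng.normal(scale=0.01, size=d)
    b = 0.0
    delta = np.zeros_like(x)
    for _ in range(outer_steps):
        # Model step: one gradient-descent update on the perturbed data.
        xp = x + delta
        err = sigmoid(xp @ w + b) - y
        w -= lr_model * (xp.T @ err) / n
        b -= lr_model * err.mean()
        # Noise steps: signed gradient descent on delta (loss goes DOWN),
        # followed by projection back into the eps-box.
        for _ in range(inner_steps):
            err = sigmoid((x + delta) @ w + b) - y
            grad_delta = err[:, None] * w[None, :]
            delta = np.clip(delta - lr_noise * np.sign(grad_delta), -eps, eps)
    return delta, w, b


# Toy data standing in for flattened image features (an assumption).
data_rng = np.random.default_rng(1)
x = data_rng.normal(size=(64, 16))
y = (x[:, 0] > 0).astype(float)
EPS = 0.05

delta, w, b = error_minimizing_noise(x, y, eps=EPS)
clean_loss = logistic_loss(w, b, x, y)
protected_loss = logistic_loss(w, b, x + delta, y)
print("loss on clean data:    ", clean_loss)
print("loss on perturbed data:", protected_loss)
```

In this kind of scheme the data owner would release `x + delta` rather than `x`. The intuition, under the assumptions above, is that the perturbation already makes the samples easy to fit, so a model trained on them has little incentive to learn the genuine features. The toy example only checks that the noise stays within its budget and lowers the loss; it says nothing about how well such protection holds for real biomedical images.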
Related papers
- Privacy-Preserving Collaborative Genomic Research: A Real-Life Deployment and Vision [2.7968600664591983]
This paper presents a privacy-preserving framework for genomic research, developed in collaboration with Lynx.MD.
The framework addresses critical cybersecurity and privacy challenges, enabling the privacy-preserving sharing and analysis of genomic data.
Implementing the framework within Lynx.MD involves encoding genomic data into binary formats and applying noise through controlled perturbation techniques.
arXiv Detail & Related papers (2024-07-12T05:43:13Z)
- Generative AI for Secure and Privacy-Preserving Mobile Crowdsensing [74.58071278710896]
Generative AI has attracted much attention from both academic and industrial fields.
Secure and privacy-preserving mobile crowdsensing (SPPMCS) has been widely applied in data collection and acquisition.
arXiv Detail & Related papers (2024-05-17T04:00:58Z)
- A Survey of Few-Shot Learning for Biomedical Time Series [3.845248204742053]
Data-driven models have tremendous potential to assist clinical diagnosis and improve patient care.
An emerging approach to overcome the scarcity of labeled data is to augment AI methods with human-like capabilities to learn new tasks with limited examples, called few-shot learning.
This survey provides a comprehensive review and comparison of few-shot learning methods for biomedical time series applications.
arXiv Detail & Related papers (2024-05-03T21:22:27Z)
- Medical Unlearnable Examples: Securing Medical Data from Unauthorized Training via Sparsity-Aware Local Masking [24.850260039814774]
Fears of unauthorized use, like training commercial AI models, hinder researchers from sharing their valuable datasets.
We propose the Sparsity-Aware Local Masking (SALM) method, which selectively perturbs significant pixel regions rather than the entire image.
Our experiments demonstrate that SALM effectively prevents unauthorized training of different models and outperforms previous SoTA data protection methods.
arXiv Detail & Related papers (2024-03-15T02:35:36Z)
- Balancing Privacy and Progress in Artificial Intelligence: Anonymization in Histopathology for Biomedical Research and Education [1.8078387709049526]
Transferring medical data "as open as possible" poses a risk to patient privacy.
Existing regulations push towards keeping medical data "as closed as necessary" to avoid re-identification risks.
This paper explores the legal regulations and terminologies for medical data-sharing.
arXiv Detail & Related papers (2023-07-18T16:53:07Z)
- BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks [68.39821375903591]
Generalist AI holds the potential to address limitations due to its versatility in interpreting different data types.
Here, we propose BiomedGPT, the first open-source and lightweight vision-language foundation model.
arXiv Detail & Related papers (2023-05-26T17:14:43Z)
- Incomplete Multimodal Learning for Complex Brain Disorders Prediction [65.95783479249745]
We propose a new incomplete multimodal data integration approach that employs transformers and generative adversarial networks.
We apply our new method to predict cognitive degeneration and disease outcomes using the multimodal imaging genetic data from Alzheimer's Disease Neuroimaging Initiative cohort.
arXiv Detail & Related papers (2023-05-25T16:29:16Z)
- Human-Centric Multimodal Machine Learning: Recent Advances and Testbed on AI-based Recruitment [66.91538273487379]
There is a certain consensus about the need to develop AI applications with a Human-Centric approach.
Human-Centric Machine Learning needs to be developed based on four main requirements: (i) utility and social good; (ii) privacy and data ownership; (iii) transparency and accountability; and (iv) fairness in AI-driven decision-making processes.
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
arXiv Detail & Related papers (2023-02-13T16:44:44Z)
- Practical Challenges in Differentially-Private Federated Survival Analysis of Medical Data [57.19441629270029]
In this paper, we take advantage of the inherent properties of neural networks to federate the process of training of survival analysis models.
In the realistic setting of small medical datasets and only a few data centers, the noise added for differential privacy makes it harder for the models to converge.
We propose DPFed-post which adds a post-processing stage to the private federated learning scheme.
arXiv Detail & Related papers (2022-02-08T10:03:24Z)
- FLOP: Federated Learning on Medical Datasets using Partial Networks [84.54663831520853]
COVID-19, the disease caused by the novel coronavirus, has led to a shortage of medical resources.
Different data-driven deep learning models have been developed to assist the diagnosis of COVID-19.
The data itself is still scarce due to patient privacy concerns.
We propose a simple yet effective algorithm, named Federated Learning on Medical datasets using Partial Networks (FLOP).
arXiv Detail & Related papers (2021-02-10T01:56:58Z)
- Privacy-preserving Artificial Intelligence Techniques in Biomedicine [3.908261721108553]
Training an AI model on sensitive data raises concerns about the privacy of individual participants.
This paper provides a structured overview of advances in privacy-preserving AI techniques in biomedicine.
It places the most important state-of-the-art approaches within a unified taxonomy and discusses their strengths, limitations, and open problems.
arXiv Detail & Related papers (2020-07-22T18:35:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.