Data privacy protection in microscopic image analysis for material data mining
- URL: http://arxiv.org/abs/2111.07892v1
- Date: Tue, 9 Nov 2021 11:16:33 GMT
- Title: Data privacy protection in microscopic image analysis for material data mining
- Authors: Boyuan Ma and Xiang Yin and Xiaojuan Ban and Haiyou Huang and Neng Zhang and Hao Wang and Weihua Xue
- Abstract summary: In this study, a material microstructure image feature extraction algorithm based on data privacy protection, FedTransfer, is proposed.
The core contributions are as follows: 1) a federated learning algorithm is introduced into the polycrystalline microstructure image segmentation task, so that data held by different users can be exploited for machine learning, breaking the data island and improving model generalization while ensuring the privacy and security of user data.
2) A data sharing strategy based on style transfer is proposed: by sharing image style information that is not critical to user confidentiality, it reduces the performance penalty caused by differences in data distribution among users.
- Score: 8.266759895003279
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent progress in material data mining has been driven by
high-capacity models trained on large datasets. However, collecting
experimental data is extremely costly because of the human effort and
expertise required, so material researchers are often reluctant to disclose
their private data. This leads to the data-island problem: it is difficult to
gather enough data to train high-quality models. In this study, a material
microstructure image feature extraction algorithm based on data privacy
protection, FedTransfer, is proposed. The core contributions are as follows:
1) a federated learning algorithm is introduced into the polycrystalline
microstructure image segmentation task, so that data held by different users
can be exploited for machine learning, breaking the data island and improving
model generalization while ensuring the privacy and security of user data;
2) a data sharing strategy based on style transfer is proposed: by sharing
image style information that is not critical to user confidentiality, it
reduces the performance penalty caused by differences in data distribution
among users.
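For concreteness, the two contributions can be sketched in a few lines of
Python. This is a minimal sketch under stated assumptions, not the paper's
implementation: it assumes plain FedAvg-style weight averaging for
contribution 1 and AdaIN-style per-channel statistics as the shared "style
information" in contribution 2; the names fedavg, style_stats and
transfer_style are hypothetical.

import numpy as np

# Contribution 1 (sketch): FedAvg-style aggregation. Each institution trains
# locally on its private micrographs; only model weights leave the site.
def fedavg(client_weights, client_sizes):
    """Dataset-size-weighted average of per-client parameter dicts."""
    total = sum(client_sizes)
    return {
        k: sum(w[k] * (n / total) for w, n in zip(client_weights, client_sizes))
        for k in client_weights[0]
    }

# Contribution 2 (sketch): share style, not content. Per-channel mean/std
# describe contrast and brightness but not grain-boundary content, so they
# are assumed safe to exchange between users.
def style_stats(img):
    """Channel-wise mean and std of an image with shape (C, H, W)."""
    return img.mean(axis=(1, 2), keepdims=True), img.std(axis=(1, 2), keepdims=True)

def transfer_style(content_img, target_mean, target_std, eps=1e-6):
    """AdaIN-style renormalization: keep structure, adopt another user's style."""
    mu, sigma = style_stats(content_img)
    return (content_img - mu) / (sigma + eps) * target_std + target_mean

# Toy round: two institutions with one-layer "models" and mismatched styles.
rng = np.random.default_rng(0)
models = [{"w": rng.normal(size=(3, 3))} for _ in range(2)]
global_model = fedavg(models, client_sizes=[120, 80])
img_a = rng.random((1, 64, 64))                        # user A's micrograph
mean_b, std_b = style_stats(rng.random((1, 64, 64)))   # user B's shared style
harmonized = transfer_style(img_a, mean_b, std_b)

In a full pipeline these pieces would sit inside the local-train/aggregate
loop of a segmentation network; the point is only that model weights and
channel statistics cross institutional boundaries while raw micrographs
never do.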
Related papers
- Enhancing User-Centric Privacy Protection: An Interactive Framework through Diffusion Models and Machine Unlearning [54.30994558765057]
The study pioneers a comprehensive privacy protection framework that safeguards image data privacy concurrently during data sharing and model publication.
We propose an interactive image privacy protection framework that utilizes generative machine learning models to modify image information at the attribute level.
Within this framework, we instantiate two modules: a differential privacy diffusion model for protecting attribute information in images and a feature unlearning algorithm for efficient updates of the trained model on the revised image dataset.
arXiv Detail & Related papers (2024-09-05T07:55:55Z)
- Assessing the Impact of Image Dataset Features on Privacy-Preserving Machine Learning [1.3604778572442302]
This study identifies image dataset characteristics that affect the utility and vulnerability of private and non-private Convolutional Neural Network (CNN) models.
We find that imbalanced datasets increase vulnerability in minority classes, but differential privacy (DP) mitigates this issue.
arXiv Detail & Related papers (2024-09-02T15:30:27Z)
- Privacy-preserving datasets by capturing feature distributions with Conditional VAEs [0.11999555634662634]
We train Conditional Variational Autoencoders (CVAEs) on feature vectors extracted from large pre-trained vision foundation models.
Our method notably outperforms traditional approaches in both medical and natural image domains.
Results underscore the potential of generative models to significantly impact deep learning applications in data-scarce and privacy-sensitive environments.
arXiv Detail & Related papers (2024-08-01T15:26:24Z)
- Privacy-Preserving Graph Machine Learning from Data to Computation: A Survey [67.7834898542701]
We focus on reviewing privacy-preserving techniques of graph machine learning.
We first review methods for generating privacy-preserving graph data.
Then we describe methods for transmitting privacy-preserved information.
arXiv Detail & Related papers (2023-07-10T04:30:23Z)
- Approximate, Adapt, Anonymize (3A): a Framework for Privacy Preserving Training Data Release for Machine Learning [3.29354893777827]
We introduce a data release framework, 3A (Approximate, Adapt, Anonymize), to maximize data utility for machine learning.
We present experimental evidence showing minimal discrepancy between performance metrics of models trained on real versus privatized datasets.
arXiv Detail & Related papers (2023-07-04T18:37:11Z)
- Towards Generalizable Data Protection With Transferable Unlearnable Examples [50.628011208660645]
We present a novel, generalizable data protection method by generating transferable unlearnable examples.
To the best of our knowledge, this is the first solution that examines data privacy from the perspective of data distribution.
arXiv Detail & Related papers (2023-05-18T04:17:01Z)
- ConfounderGAN: Protecting Image Data Privacy with Causal Confounder [85.6757153033139]
We propose ConfounderGAN, a generative adversarial network (GAN) that can make personal image data unlearnable to protect the data privacy of its owners.
Experiments are conducted in six image classification datasets, consisting of three natural object datasets and three medical datasets.
arXiv Detail & Related papers (2022-12-04T08:49:14Z)
- Private Set Generation with Discriminative Information [63.851085173614]
Differentially private data generation is a promising solution to the data privacy challenge.
Existing private generative models struggle with the utility of synthetic samples.
We introduce a simple yet effective method that greatly improves the sample utility of state-of-the-art approaches.
arXiv Detail & Related papers (2022-11-07T10:02:55Z)
- Privacy Enhancing Machine Learning via Removal of Unwanted Dependencies [21.97951347784442]
This paper studies new variants of supervised and adversarial learning methods, which remove sensitive information from the data before it is released for a particular application.
The explored methods optimize privacy preserving feature mappings and predictive models simultaneously in an end-to-end fashion.
Experimental results on mobile sensing and face datasets demonstrate that our models can successfully maintain the utility performances of predictive models while causing sensitive predictions to perform poorly.
arXiv Detail & Related papers (2020-07-30T19:55:10Z)
- TIPRDC: Task-Independent Privacy-Respecting Data Crowdsourcing Framework for Deep Learning with Anonymized Intermediate Representations [49.20701800683092]
We present TIPRDC, a task-independent privacy-respecting data crowdsourcing framework with anonymized intermediate representation.
The goal of this framework is to learn a feature extractor that hides private information in the intermediate representations while maximally retaining the original information embedded in the raw data, so that the data collector can accomplish unknown learning tasks.
arXiv Detail & Related papers (2020-05-23T06:21:26Z)
- Privacy-Preserving Image Classification in the Local Setting [17.375582978294105]
Local Differential Privacy (LDP) offers a promising solution: data owners randomly perturb their input before release, giving the data plausible deniability.
In this paper, we consider a two-party image classification problem, in which data owners hold the image and the untrustworthy data user would like to fit a machine learning model with these images as input.
We propose a supervised image feature extractor, DCAConv, which produces an image representation with a scalable domain size; a minimal sketch of the local-perturbation idea appears after this list.
arXiv Detail & Related papers (2020-02-09T01:25:52Z)
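The local-perturbation idea in the last entry is generic enough to sketch
without DCAConv itself, whose details the summary above does not give. A
minimal sketch, assuming Laplace noise calibrated to the pixel range; the
function ldp_perturb and its parameters are illustrative, not the paper's.

import numpy as np

def ldp_perturb(img, epsilon, lo=0.0, hi=1.0, rng=None):
    """Release a perturbed copy of img (pixel values in [lo, hi]).

    Laplace noise with scale (hi - lo) / epsilon gives each pixel value
    epsilon-LDP in isolation; a real system would also have to account for
    composition across pixels, which this sketch ignores.
    """
    rng = rng or np.random.default_rng()
    sensitivity = hi - lo                  # max change one value can make
    noisy = img + rng.laplace(0.0, sensitivity / epsilon, size=img.shape)
    return np.clip(noisy, lo, hi)          # clipping is post-processing, so DP is preserved

# The owner perturbs locally and releases only the noisy copy to the
# untrusted data user who fits the model.
private = np.random.default_rng(0).random((32, 32))
released = ldp_perturb(private, epsilon=1.0)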
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.