Learning Realistic Patterns from Unrealistic Stimuli: Generalization and
Data Anonymization
- URL: http://arxiv.org/abs/2009.10007v2
- Date: Thu, 9 Dec 2021 11:56:11 GMT
- Title: Learning Realistic Patterns from Unrealistic Stimuli: Generalization and
Data Anonymization
- Authors: Konstantinos Nikolaidis, Stein Kristiansen, Thomas Plagemann, Vera
Goebel, Knut Liestøl, Mohan Kankanhalli, Gunn Marit Traaen, Britt Øverland,
Harriet Akre, Lars Aakerøy, Sigurd Steinshamn
- Abstract summary: This work investigates a simple yet unconventional approach for anonymized data synthesis to enable third parties to benefit from such private data.
We use sleep monitoring data from both an open and a large closed clinical study and evaluate whether (1) end-users can create and successfully use customized classification models for sleep apnea detection, and (2) the identity of participants in the study is protected.
- Score: 0.5091527753265949
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Good training data is a prerequisite to develop useful ML applications.
However, in many domains existing data sets cannot be shared due to privacy
regulations (e.g., from medical studies). This work investigates a simple yet
unconventional approach for anonymized data synthesis to enable third parties
to benefit from such private data. We explore the feasibility of learning
implicitly from unrealistic, task-relevant stimuli, which are synthesized by
exciting the neurons of a trained deep neural network (DNN). As such, neuronal
excitation serves as a pseudo-generative model. The stimuli data is used to
train new classification models. Furthermore, we extend this framework to
inhibit representations that are associated with specific individuals. We use
sleep monitoring data from both an open and a large closed clinical study and
evaluate whether (1) end-users can create and successfully use customized
classification models for sleep apnea detection, and (2) the identity of
participants in the study is protected. Extensive comparative empirical
investigation shows that different algorithms trained on the stimuli are able
to generalize successfully on the same task as the original model. However,
architectural and algorithmic similarity between the new and original models
plays an important role in performance. For similar architectures, the
performance is close to that of using the true data (e.g., an accuracy
difference of 0.56% and a Kappa coefficient difference of 0.03-0.04). Further
experiments show that the stimuli can, to a large extent, successfully
anonymize participants of the
clinical studies.
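The paper does not include code, but the core mechanism the abstract describes, exciting the output neurons of a trained DNN by gradient ascent on the input so that the network acts as a pseudo-generative model, can be sketched roughly as follows. The PyTorch framing and all names (synthesize_stimulus, trained_dnn) are illustrative assumptions, not the authors' implementation.

```python
import torch

def synthesize_stimulus(model, target_class, input_shape, steps=200, lr=0.1):
    """Gradient-ascent 'neuronal excitation': optimize a random input so
    that the trained model's output neuron for target_class is maximally
    activated. The result is task-relevant but unrealistic."""
    model.eval()
    x = torch.randn(1, *input_shape, requires_grad=True)  # random seed input
    optimizer = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        # Ascend the target neuron's activation by descending its negation.
        loss = -model(x)[0, target_class]
        loss.backward()
        optimizer.step()
    return x.detach()

# Hypothetical usage: build an anonymized training set of (stimulus, label)
# pairs and train a fresh classifier on it instead of on the private data.
# stimuli = [(synthesize_stimulus(trained_dnn, c, (1, 3000)), c) for c in (0, 1)]
```

Inputs produced this way are the "unrealistic, task-relevant stimuli" of the abstract: they excite the right neurons without resembling any participant's recording.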
Related papers
- Latent Variable Sequence Identification for Cognitive Models with Neural Bayes Estimation [7.7227297059345466]
We present an approach that extends neural Bayes estimation to learn a direct mapping between experimental data and the targeted latent variable space.
Our work underscores that combining recurrent neural networks and simulation-based inference to identify latent variable sequences can enable researchers to access a wider class of cognitive models.
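As an illustration of the general recipe (none of this is from the paper): a recurrent network is trained on simulated (data, latents) pairs so that it maps observed sequences directly to latent-variable estimates. A minimal PyTorch-style sketch:

```python
import torch.nn as nn

class NeuralBayesEstimator(nn.Module):
    """Hypothetical sketch: an RNN that maps an observed data sequence
    directly to estimates of the latent variables that generated it,
    trained on (data, latents) pairs simulated from the cognitive model."""
    def __init__(self, obs_dim, latent_dim, hidden_dim=64):
        super().__init__()
        self.rnn = nn.GRU(obs_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, latent_dim)

    def forward(self, x):            # x: (batch, time, obs_dim)
        h, _ = self.rnn(x)           # hidden state at every time step
        return self.head(h)          # latent estimate per time step
```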
arXiv Detail & Related papers (2024-06-20T21:13:39Z)
- Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
The main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
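For illustration, a minimal one-step-ahead predictor in the feedforward (NARX-style) family the survey discusses; all names and sizes are assumptions:

```python
import torch
import torch.nn as nn

class NARXPredictor(nn.Module):
    """Hypothetical sketch of a feedforward identification model: predict
    the next output y[t] from windows of past inputs u and past outputs y."""
    def __init__(self, n_past_u, n_past_y, hidden_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_past_u + n_past_y, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, past_u, past_y):
        # Regressor vector: concatenated windows of past inputs and outputs.
        return self.net(torch.cat([past_u, past_y], dim=-1))
```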
arXiv Detail & Related papers (2023-01-30T12:38:31Z)
- Improving the Level of Autism Discrimination through GraphRNN Link Prediction [8.103074928419527]
This paper learns the edge distribution of real brain networks through GraphRNN and uses the learned model to generate synthetic brain-network samples.
The experimental results show that the combination of original and synthetic data greatly improves the discrimination of the neural network.
arXiv Detail & Related papers (2022-02-19T06:50:32Z)
- Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z)
- The Causal Neural Connection: Expressiveness, Learnability, and Inference [125.57815987218756]
An object called structural causal model (SCM) represents a collection of mechanisms and sources of random variation of the system under investigation.
In this paper, we show that the causal hierarchy theorem (Thm. 1, Bareinboim et al., 2020) still holds for neural models.
We introduce a special type of SCM called a neural causal model (NCM), and formalize a new type of inductive bias to encode structural constraints necessary for performing causal inferences.
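For context, the NCM builds on the standard notion of a structural causal model; one common formalization (our notation, a sketch rather than a quotation from the paper) is:

```latex
% A structural causal model M is a tuple of exogenous variables U,
% endogenous variables V, mechanisms F, and a distribution over U.
\mathcal{M} = \langle \mathbf{U}, \mathbf{V}, \mathcal{F}, P(\mathbf{U}) \rangle,
\qquad
V_i \leftarrow f_i\big(\mathrm{Pa}(V_i), U_i\big), \quad f_i \in \mathcal{F}.
```

In an NCM, roughly speaking, each mechanism f_i is parameterized by a neural network, which is what makes the expressiveness and learnability questions in the title non-trivial.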
arXiv Detail & Related papers (2021-07-02T01:55:18Z)
- Handling Data Heterogeneity with Generative Replay in Collaborative Learning for Medical Imaging [21.53220262343254]
We present a novel generative replay strategy to address the challenge of data heterogeneity in collaborative learning methods.
A primary model learns the desired task, and an auxiliary "generative replay model" either synthesizes images that closely resemble the input images or helps extract latent variables.
The generative replay strategy is flexible: it can either be incorporated into existing collaborative learning methods to improve their ability to handle data heterogeneity across institutions, or be used as a novel, standalone collaborative learning framework (termed FedReplay) to reduce communication cost.
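A minimal sketch of one such training step, assuming a hypothetical replay_generator that synthesizes images resembling the local inputs (PyTorch-style, not the authors' code):

```python
import torch

def replay_train_step(primary_model, replay_generator, batch, optimizer, loss_fn):
    """Illustrative step: augment the institution's local batch with
    replayed synthetic samples so the primary model sees a less
    heterogeneous data distribution across institutions."""
    images, labels = batch
    with torch.no_grad():
        # Assumed generator interface: synthesize images resembling the inputs.
        replayed = replay_generator(images)
    inputs = torch.cat([images, replayed])
    targets = torch.cat([labels, labels])  # replayed samples keep their labels
    optimizer.zero_grad()
    loss = loss_fn(primary_model(inputs), targets)
    loss.backward()
    optimizer.step()
    return loss.item()
```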
arXiv Detail & Related papers (2021-06-24T17:39:55Z)
- No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in the real-world federated system is learning with non-IID data.
We propose a novel and simple algorithm, Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
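A hypothetical sketch of the calibration step implied by the summary: sample virtual feature vectors from per-class Gaussians and refit only the classifier head. Here class_stats and fit_classifier are assumed interfaces, not the paper's API.

```python
import numpy as np

def ccvr_calibrate(fit_classifier, class_stats, n_per_class=100):
    """Illustrative CCVR-style calibration: draw virtual feature vectors
    from an approximated per-class Gaussian and refit only the classifier
    head on them (no raw client data needs to be shared)."""
    feats, labels = [], []
    for c, (mean, cov) in class_stats.items():  # class -> (mean, covariance)
        feats.append(np.random.multivariate_normal(mean, cov, n_per_class))
        labels.append(np.full(n_per_class, c))
    return fit_classifier(np.vstack(feats), np.concatenate(labels))
```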
arXiv Detail & Related papers (2021-06-09T12:02:29Z)
- Learning identifiable and interpretable latent models of high-dimensional neural activity using pi-VAE [10.529943544385585]
We propose a method that integrates key ingredients from latent models and traditional neural encoding models.
Our method, pi-VAE, is inspired by recent progress on identifiable variational auto-encoders.
We validate pi-VAE using synthetic data, and apply it to analyze neurophysiological datasets from rat hippocampus and macaque motor cortex.
arXiv Detail & Related papers (2020-11-09T22:00:38Z)
- Modeling Shared Responses in Neuroimaging Studies through MultiView ICA [94.31804763196116]
Group studies involving large cohorts of subjects are important to draw general conclusions about brain functional organization.
We propose a novel MultiView Independent Component Analysis model for group studies, where data from each subject are modeled as a linear combination of shared independent sources plus noise.
We demonstrate the usefulness of our approach first on fMRI data, where our model demonstrates improved sensitivity in identifying common sources among subjects.
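The generative model stated in the summary can be written compactly as follows (our notation; the paper's exact noise placement may differ in detail):

```latex
% Subject i's data x_i: a subject-specific mixing matrix A_i applied to
% shared independent sources s, plus subject-specific noise n_i.
\mathbf{x}_i = A_i \mathbf{s} + \mathbf{n}_i, \qquad i = 1, \dots, m,
\qquad s_1, \dots, s_k \ \text{mutually independent.}
```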
arXiv Detail & Related papers (2020-06-11T17:29:53Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To keep the enlarged dataset tractable, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
- Do Saliency Models Detect Odd-One-Out Targets? New Datasets and Evaluations [15.374430656911498]
We investigate singleton detection, which can be thought of as a canonical example of salience.
We show that nearly all saliency algorithms do not adequately respond to singleton targets in synthetic and natural images.
arXiv Detail & Related papers (2020-05-13T20:59:53Z)