Data from Model: Extracting Data from Non-robust and Robust Models
- URL: http://arxiv.org/abs/2007.06196v1
- Date: Mon, 13 Jul 2020 05:27:48 GMT
- Title: Data from Model: Extracting Data from Non-robust and Robust Models
- Authors: Philipp Benz, Chaoning Zhang, Tooba Imtiaz, In-So Kweon
- Abstract summary: This work explores the reverse process of generating data from a model, attempting to reveal the relationship between the data and the model.
We repeat the process of Data to Model (DtM) and Data from Model (DfM) in sequence and explore the loss of feature mapping information.
Our results show that the accuracy drop is limited even after multiple sequences of DtM and DfM, especially for robust models.
- Score: 83.60161052867534
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The essence of deep learning is to exploit data to train a deep neural
network (DNN) model. This work explores the reverse process of generating data
from a model, attempting to reveal the relationship between the data and the
model. We repeat the process of Data to Model (DtM) and Data from Model (DfM)
in sequence and explore the loss of feature mapping information by measuring
the accuracy drop on the original validation dataset. We perform this
experiment for both a non-robust and robust origin model. Our results show that
the accuracy drop is limited even after multiple sequences of DtM and DfM,
especially for robust models. The success of this cycling transformation can be
attributed to the shared feature mapping existing in data and model. Using the
same data, we observe that different DtM processes result in models having
different features, especially for different network architecture families,
even though they achieve comparable performance.
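To make the DtM/DfM cycle concrete, here is a minimal PyTorch-style sketch of one round: a fresh model is fitted on the current dataset (DtM), and a new labelled dataset is then optimized against the frozen model (DfM). The inversion objective, function names, and hyperparameters below are illustrative placeholders, not the paper's exact extraction procedure.

```python
import torch
import torch.nn.functional as F

def data_to_model(model, loader, epochs=10, lr=0.1):
    """DtM: train a fresh model on the current (real or synthetic) dataset."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            F.cross_entropy(model(x), y).backward()
            opt.step()
    return model

def data_from_model(model, num_classes, per_class, shape, steps=200, lr=0.1):
    """DfM (illustrative): synthesize inputs that the frozen model assigns to
    each class with high confidence. This is a generic model-inversion
    objective, not necessarily the extraction procedure used in the paper."""
    model.eval()
    for p in model.parameters():          # freeze the origin model
        p.requires_grad_(False)
    xs, ys = [], []
    for c in range(num_classes):
        x = torch.randn(per_class, *shape, requires_grad=True)
        y = torch.full((per_class,), c, dtype=torch.long)
        opt = torch.optim.Adam([x], lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            F.cross_entropy(model(x), y).backward()
            opt.step()
        xs.append(x.detach())
        ys.append(y)
    return torch.cat(xs), torch.cat(ys)
```

A full experiment would alternate these two functions over several rounds and track the accuracy of each successive model on the original validation set.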
Related papers
- When to Trust Your Data: Enhancing Dyna-Style Model-Based Reinforcement Learning With Data Filter [7.886307329450978]
Dyna-style algorithms combine model-based and model-free reinforcement learning by using simulated data from an estimated environment model to accelerate model-free training.
Simulated data from an inaccurate model can, however, mislead training; previous works address this issue by using model ensembles or by pretraining the estimated model with data collected from the real environment.
We introduce an out-of-distribution data filter that removes simulated data from the estimated model that significantly diverges from data collected in the real environment.
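Purely as an illustration, one simple way to realize such a filter is a nearest-neighbour distance threshold calibrated on the real data; the metric, threshold, and function name below are assumptions, not the paper's design.

```python
import numpy as np

def filter_simulated(real_states, sim_states, quantile=0.95):
    """Keep only simulated states whose nearest real state is not unusually far.
    Distance to the nearest real sample stands in for whatever divergence
    measure the paper actually uses."""
    # Nearest-neighbour distances among the real states calibrate the threshold.
    d_real = np.linalg.norm(real_states[:, None, :] - real_states[None, :, :], axis=-1)
    np.fill_diagonal(d_real, np.inf)
    threshold = np.quantile(d_real.min(axis=1), quantile)

    # Distance of each simulated state to its nearest real state.
    d_sim = np.linalg.norm(sim_states[:, None, :] - real_states[None, :, :], axis=-1)
    keep = d_sim.min(axis=1) <= threshold
    return sim_states[keep]
```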
arXiv Detail & Related papers (2024-10-16T01:49:03Z)
- Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data [49.73114504515852]
We show that replacing the original real data by each generation's synthetic data does indeed tend towards model collapse.
We demonstrate that accumulating the successive generations of synthetic data alongside the original real data avoids model collapse.
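A toy sketch of the two regimes contrasted above; `fit_generator` and `sample` are a hypothetical generator interface used only for illustration.

```python
def replace_regime(real_data, fit_generator, n_generations, n_samples):
    """Each generation trains only on the previous generation's synthetic data
    (the setting shown to drift toward model collapse)."""
    data = real_data
    for _ in range(n_generations):
        gen = fit_generator(data)
        data = gen.sample(n_samples)          # discard all earlier data
    return gen

def accumulate_regime(real_data, fit_generator, n_generations, n_samples):
    """Each generation trains on the original real data plus all synthetic
    data produced so far (the setting reported to avoid collapse)."""
    pool = list(real_data)
    for _ in range(n_generations):
        gen = fit_generator(pool)
        pool.extend(gen.sample(n_samples))    # keep real + all synthetic data
    return gen
```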
arXiv Detail & Related papers (2024-04-01T18:31:24Z)
- Modified CycleGAN for the synthesization of samples for wheat head segmentation [0.09999629695552192]
In the absence of an annotated dataset, synthetic data can be used for model development.
We develop a realistic annotated synthetic dataset for wheat head segmentation.
The resulting model achieved a Dice score of 83.4% on an internal dataset and 83.6% on two external Global Wheat Head Detection datasets.
arXiv Detail & Related papers (2024-02-23T06:42:58Z)
- Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
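As a small, self-contained illustration of such a self-consuming loop with kernel density estimation (assuming 1-D Gaussian data and SciPy's `gaussian_kde`; this is not the paper's formal setup):

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
real = rng.normal(loc=0.0, scale=1.0, size=1000)    # ground-truth data

data = real.copy()
for generation in range(5):
    kde = gaussian_kde(data)                         # fit KDE on current data
    synthetic = kde.resample(1000)[0]
    # Fully synthetic retraining; mixing `real` back in here would be the
    # "mixed data training" regime whose error propagation the paper analyzes.
    data = synthetic
    print(generation, data.mean(), data.std())       # statistics drift across generations
```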
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
- Revisiting Permutation Symmetry for Merging Models between Different Datasets [3.234560001579257]
We investigate the properties of merging models between different datasets.
We find that the accuracy of the merged model decreases more significantly as the datasets diverge more.
We show that condensed datasets created by dataset condensation can be used as substitutes for the original datasets.
arXiv Detail & Related papers (2023-06-09T03:00:34Z)
- Private Gradient Estimation is Useful for Generative Modeling [25.777591229903596]
We present a new private generative modeling approach where samples are generated via Hamiltonian dynamics, with gradients of the private data distribution estimated by a well-trained network.
Our model is able to generate data with a resolution of 256x256.
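A heavily simplified sketch of sampling with estimated gradients; Langevin updates stand in here for the Hamiltonian dynamics named above, and `score_net` is a placeholder for the well-trained gradient estimator.

```python
import torch

def langevin_sample(score_net, x_init, step=1e-4, n_steps=1000):
    """Draw samples by following estimated gradients of the data log-density.
    Langevin dynamics replaces the Hamiltonian dynamics used in the paper;
    score_net(x) is assumed to return an estimate of grad_x log p(x)."""
    x = x_init.clone()
    for _ in range(n_steps):
        noise = torch.randn_like(x)
        x = x + 0.5 * step * score_net(x) + (step ** 0.5) * noise
    return x
```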
arXiv Detail & Related papers (2023-05-18T02:51:17Z)
- TRAK: Attributing Model Behavior at Scale [79.56020040993947]
We present TRAK (Tracing with the Randomly-projected After Kernel), a data attribution method that is both effective and computationally tractable for large-scale, differentiable models.
arXiv Detail & Related papers (2023-03-24T17:56:22Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
Fine-tuned models are often readily available while their training data is not, which creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
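The simplest form of parameter-space merging is a uniform average of matching weight tensors; the sketch below shows that baseline and does not reproduce the weighting scheme the paper proposes.

```python
import torch

def average_state_dicts(state_dicts):
    """Merge models with identical architectures by averaging each parameter
    tensor across models (uniform weights; the paper derives the merging
    weights differently)."""
    merged = {}
    for name in state_dicts[0]:
        merged[name] = torch.stack([sd[name].float() for sd in state_dicts]).mean(dim=0)
    return merged

# Usage: merged = average_state_dicts([model_a.state_dict(), model_b.state_dict()])
#        model_merged.load_state_dict(merged)
```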
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data only by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z)