Improving the quality of generative models through Smirnov transformation
- URL: http://arxiv.org/abs/2110.15914v1
- Date: Fri, 29 Oct 2021 17:01:06 GMT
- Title: Improving the quality of generative models through Smirnov transformation
- Authors: Ángel González-Prieto, Alberto Mozo, Sandra Gómez-Canaval, Edgar
Talavera
- Abstract summary: We propose a novel activation function to be used as output of the generator agent.
It is based on the Smirnov probabilistic transformation and it is specifically designed to improve the quality of the generated data.
- Score: 1.3492000366723798
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Solving the convergence issues of Generative Adversarial Networks
(GANs) is one of the most prominent open problems in generative modelling. In
this work, we propose a novel activation function to be used as the output of
the generator. This activation function is based on the Smirnov probabilistic
transformation and is specifically designed to improve the quality of the
generated data. In sharp contrast with previous works, our activation function
provides a more general approach that handles not only the replication of
categorical variables but any type of data distribution, continuous or
discrete. Moreover, our activation function is differentiable and can
therefore be seamlessly integrated into the backpropagation computations of
the GAN training process. To validate this approach, we evaluate our proposal
on two datasets: a) an artificially generated dataset containing a mixture of
discrete and continuous variables, and b) a real dataset of flow-based network
traffic containing both normal connections and cryptomining attacks. To assess
the fidelity of the generated data, we analyse both statistical quality
measures and the performance obtained when the synthetic data are used to
train a nested machine-learning classifier. The experimental results show that
the GAN equipped with the new activation function clearly outperforms both a
naïve mean-based generator and a standard GAN. The quality of the data is high
enough that the generated data can fully substitute for the real data when
training the nested classifier, without any loss of accuracy. This result
encourages the use of GANs to produce high-quality synthetic data for
scenarios in which data privacy must be guaranteed.
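The Smirnov (probability integral) transformation underlying the proposed activation maps a value in (0, 1) through the inverse CDF of a target distribution, so that the transformed outputs follow that distribution. The sketch below is a hypothetical illustration of this general idea, not the paper's exact implementation: it approximates the inverse empirical CDF of one training column by piecewise-linear interpolation between quantile knots, which is differentiable almost everywhere, and the function name and parameters are illustrative.

```python
import numpy as np

def make_smirnov_activation(train_col, n_knots=100):
    """Differentiable-a.e. approximation of the inverse empirical CDF
    (Smirnov / probability integral transform) of one data column.
    Illustrative sketch only; names and defaults are assumptions."""
    # Quantile knots summarising the empirical distribution of the column.
    probs = np.linspace(0.0, 1.0, n_knots)
    knots = np.quantile(train_col, probs)

    def activation(z):
        # Squash the raw generator output into (0, 1) with a sigmoid.
        u = 1.0 / (1.0 + np.exp(-z))
        # Piecewise-linear inverse CDF: maps u back to the data's scale.
        return np.interp(u, probs, knots)

    return activation
```

Because the interpolation knots come from the training data, the activation steers the generator's raw outputs towards the empirical distribution of each variable, which is how such a transform can handle continuous and discrete columns within a single mechanism.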
Related papers
- An improved tabular data generator with VAE-GMM integration [9.4491536689161]
We propose a novel Variational Autoencoder (VAE)-based model that addresses limitations of current approaches.
Inspired by the TVAE model, our approach incorporates a Bayesian Gaussian Mixture model (BGM) within the VAE architecture.
We thoroughly validate our model on three real-world datasets with mixed data types, including two medically relevant ones.
arXiv Detail & Related papers (2024-04-12T12:31:06Z) - Fake It Till Make It: Federated Learning with Consensus-Oriented Generation [52.82176415223988]
We propose federated learning with consensus-oriented generation (FedCOG).
FedCOG consists of two key components at the client side: complementary data generation and knowledge-distillation-based model training.
Experiments on classical and real-world FL datasets show that FedCOG consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-12-10T18:49:59Z) - SMaRt: Improving GANs with Score Matching Regularity [94.81046452865583]
Generative adversarial networks (GANs) usually struggle in learning from highly diverse data, whose underlying manifold is complex.
We show that score matching serves as a promising solution to this issue thanks to its capability of persistently pushing the generated data points towards the real data manifold.
We propose to improve the optimization of GANs with score matching regularity (SMaRt).
arXiv Detail & Related papers (2023-11-30T03:05:14Z) - Time-series Generation by Contrastive Imitation [87.51882102248395]
We study a generative framework that seeks to combine the strengths of both: Motivated by a moment-matching objective to mitigate compounding error, we optimize a local (but forward-looking) transition policy.
At inference, the learned policy serves as the generator for iterative sampling, and the learned energy serves as a trajectory-level measure for evaluating sample quality.
arXiv Detail & Related papers (2023-11-02T16:45:25Z) - Improving Out-of-Distribution Robustness of Classifiers via Generative Interpolation [56.620403243640396]
Deep neural networks achieve superior performance for learning from independent and identically distributed (i.i.d.) data.
However, their performance deteriorates significantly when handling out-of-distribution (OoD) data.
We develop a simple yet effective method called Generative Interpolation to fuse generative models trained from multiple domains for synthesizing diverse OoD samples.
arXiv Detail & Related papers (2023-07-23T03:53:53Z) - Synthetic data, real errors: how (not) to publish and use synthetic data [86.65594304109567]
We show how the generative process affects the downstream ML task.
We introduce Deep Generative Ensemble (DGE) to approximate the posterior distribution over the generative process model parameters.
arXiv Detail & Related papers (2023-05-16T07:30:29Z) - A Kernelised Stein Statistic for Assessing Implicit Generative Models [10.616967871198689]
We propose a principled procedure to assess the quality of a synthetic data generator.
The sample size from the synthetic data generator can be as large as desired, while the size of the observed data, which the generator aims to emulate, is fixed.
arXiv Detail & Related papers (2022-05-31T23:40:21Z) - Improving Model Compatibility of Generative Adversarial Networks by Boundary Calibration [24.28407308818025]
Boundary-Calibration GANs (BCGANs) are proposed to improve GAN's model compatibility.
BCGANs generate realistic images like the original GANs but also achieve superior model compatibility compared to them.
arXiv Detail & Related papers (2021-11-03T16:08:09Z) - Copula Flows for Synthetic Data Generation [0.5801044612920815]
We propose to use a probabilistic model as a synthetic data generator.
We benchmark our method on both simulated and real datasets in terms of density estimation.
arXiv Detail & Related papers (2021-01-03T10:06:23Z) - Partially Conditioned Generative Adversarial Networks [75.08725392017698]
Generative Adversarial Networks (GANs) let one synthesise artificial datasets by implicitly modelling the underlying probability distribution of a real-world training dataset.
With the introduction of Conditional GANs and their variants, these methods were extended to generating samples conditioned on ancillary information available for each sample within the dataset.
In this work, we argue that standard Conditional GANs are not suitable for such a task and propose a new Adversarial Network architecture and training strategy.
arXiv Detail & Related papers (2020-07-06T15:59:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.