Related papers: DSF-GAN: DownStream Feedback Generative Adversarial Network

DSF-GAN: DownStream Feedback Generative Adversarial Network

URL: http://arxiv.org/abs/2403.18267v1
Date: Wed, 27 Mar 2024 05:41:50 GMT
Title: DSF-GAN: DownStream Feedback Generative Adversarial Network
Authors: Oriel Perets, Nadav Rappoport,
Abstract summary: We propose a novel architecture called the DownStream Feedback Generative Adversarial Network (DSF-GAN) DSF-GAN incorporates feedback from a downstream prediction model during training to augment the generator's loss function with valuable information. Our experiments demonstrate improved model performance when training on synthetic samples generated by DSF-GAN, compared to those generated by the same GAN architecture without feedback.
Score: 0.07083082555458872
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Utility and privacy are two crucial measurements of the quality of synthetic tabular data. While significant advancements have been made in privacy measures, generating synthetic samples with high utility remains challenging. To enhance the utility of synthetic samples, we propose a novel architecture called the DownStream Feedback Generative Adversarial Network (DSF-GAN). This approach incorporates feedback from a downstream prediction model during training to augment the generator's loss function with valuable information. Thus, DSF-GAN utilizes a downstream prediction task to enhance the utility of synthetic samples. To evaluate our method, we tested it using two popular datasets. Our experiments demonstrate improved model performance when training on synthetic samples generated by DSF-GAN, compared to those generated by the same GAN architecture without feedback. The evaluation was conducted on the same validation set comprising real samples. All code and datasets used in this research will be made openly available for ease of reproduction.

Related papers

An Empirical Study of Validating Synthetic Data for Text-Based Person Retrieval [51.10419281315848]
We conduct an empirical study to explore the potential of synthetic data for Text-Based Person Retrieval (TBPR) research. We propose an inter-class image generation pipeline, in which an automatic prompt construction strategy is introduced. We develop an intra-class image augmentation pipeline, in which the generative AI models are applied to further edit the images.
arXiv Detail & Related papers (2025-03-28T06:18:15Z)
Beyond Sample-Level Feedback: Using Reference-Level Feedback to Guide Data Synthesis [55.65459867300319]
LLMs demonstrate remarkable capabilities in following natural language instructions, largely due to instruction-tuning on high-quality datasets. Recent approaches incorporate feedback to improve data quality, but typically operate at the sample level, generating and applying feedback for each response individually. We propose Reference-Level Feedback, a novel methodology that instead collects feedback based on high-quality reference samples from carefully curated seed data.
arXiv Detail & Related papers (2025-02-06T21:29:00Z)
Synthetic Image Learning: Preserving Performance and Preventing Membership Inference Attacks [5.0243930429558885]
This paper introduces Knowledge Recycling (KR), a pipeline designed to optimise the generation and use of synthetic data for training downstream classifiers. At the heart of this pipeline is Generative Knowledge Distillation (GKD), the proposed technique that significantly improves the quality and usefulness of the information. The results show a significant reduction in the performance gap between models trained on real and synthetic data, with models based on synthetic data outperforming those trained on real data in some cases.
arXiv Detail & Related papers (2024-07-22T10:31:07Z)
Bt-GAN: Generating Fair Synthetic Healthdata via Bias-transforming Generative Adversarial Networks [3.3903891679981593]
We present Bias-transforming Generative Adversarial Networks (Bt-GAN), a GAN-based synthetic data generator specifically designed for the healthcare domain. Our results demonstrate that Bt-GAN achieves SOTA accuracy while significantly improving fairness and minimizing bias.
arXiv Detail & Related papers (2024-04-21T12:16:38Z)
UAV-Sim: NeRF-based Synthetic Data Generation for UAV-based Perception [62.71374902455154]
We leverage recent advancements in neural rendering to improve static and dynamic novelview UAV-based image rendering. We demonstrate a considerable performance boost when a state-of-the-art detection model is optimized primarily on hybrid sets of real and synthetic data.
arXiv Detail & Related papers (2023-10-25T00:20:37Z)
Unsupervised evaluation of GAN sample quality: Introducing the TTJac Score [5.1359892878090845]
"TTJac score" is proposed to measure the fidelity of individual synthesized images in a data-free manner. The experimental results of applying the proposed metric to StyleGAN 2 and StyleGAN 2 ADA models on FFHQ, AFHQ-Wild, LSUN-Cars, and LSUN-Horse datasets are presented.
arXiv Detail & Related papers (2023-08-31T19:55:50Z)
Synthetic data, real errors: how (not) to publish and use synthetic data [86.65594304109567]
We show how the generative process affects the downstream ML task. We introduce Deep Generative Ensemble (DGE) to approximate the posterior distribution over the generative process model parameters.
arXiv Detail & Related papers (2023-05-16T07:30:29Z)
Revisiting the Evaluation of Image Synthesis with GANs [55.72247435112475]
This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models. In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set.
arXiv Detail & Related papers (2023-04-04T17:54:32Z)
PhysioGAN: Training High Fidelity Generative Model for Physiological Sensor Readings [6.029263679246354]
We present PHYSIOGAN, a generative model to produce high fidelity synthetic physiological sensor data readings. We evaluate it against the state-of-the-art techniques using two different real-world datasets: ECG classification and activity recognition from motion sensors datasets.
arXiv Detail & Related papers (2022-04-25T07:38:43Z)
CAFE: Learning to Condense Dataset by Aligning Features [72.99394941348757]
We propose a novel scheme to Condense dataset by Aligning FEatures (CAFE) At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales. We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art.
arXiv Detail & Related papers (2022-03-03T05:58:49Z)
Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data [125.7135706352493]
Generative adversarial networks (GANs) typically require ample data for training in order to synthesize high-fidelity images. Recent studies have shown that training GANs with limited data remains formidable due to discriminator overfitting. This paper introduces a novel strategy called Adaptive Pseudo Augmentation (APA) to encourage healthy competition between the generator and the discriminator.
arXiv Detail & Related papers (2021-11-12T18:13:45Z)
Reparameterized Sampling for Generative Adversarial Networks [71.30132908130581]
We propose REP-GAN, a novel sampling method that allows general dependent proposals by REizing the Markov chains into the latent space of the generator. Empirically, extensive experiments on synthetic and real datasets demonstrate that our REP-GAN largely improves the sample efficiency and obtains better sample quality simultaneously.
arXiv Detail & Related papers (2021-07-01T10:34:55Z)
Using generative adversarial networks to synthesize artificial financial datasets [2.376767664163658]
We propose to use GANs to synthesize artificial financial data for research and benchmarking purposes. We test this approach on three American Express datasets, and show that properly trained GANs can replicate these datasets with high fidelity.
arXiv Detail & Related papers (2020-02-06T14:25:08Z)

This list is automatically generated from the titles and abstracts of the papers in this site.