Reducing Training Sample Memorization in GANs by Training with
Memorization Rejection
- URL: http://arxiv.org/abs/2210.12231v1
- Date: Fri, 21 Oct 2022 20:17:50 GMT
- Title: Reducing Training Sample Memorization in GANs by Training with
Memorization Rejection
- Authors: Andrew Bai, Cho-Jui Hsieh, Wendy Kan, Hsuan-Tien Lin
- Abstract summary: We propose memorization rejection, a training scheme that rejects generated samples that are near-duplicates of training samples during training.
Our scheme is simple, generic and can be directly applied to any GAN architecture.
- Score: 80.0916819303573
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative adversarial networks (GANs) continue to be a popular research
direction due to their high generation quality. It is observed that many
state-of-the-art GANs generate samples that are more similar to the training
set than to a holdout testing set from the same distribution, hinting that some
training samples are implicitly memorized in these models. This memorization
behavior is unfavorable in many applications that demand the generated samples
to be sufficiently distinct from known samples. Nevertheless, it is unclear
whether it is possible to reduce memorization without compromising the
generation quality. In this paper, we propose memorization rejection, a
training scheme that rejects generated samples that are near-duplicates of
training samples during training. Our scheme is simple, generic and can be
directly applied to any GAN architecture. Experiments on multiple datasets and
GAN models validate that memorization rejection effectively reduces training
sample memorization, and in many cases does not sacrifice the generation
quality. Code to reproduce the experimental results can be found at
https://github.com/jybai/MRGAN.
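As a rough illustration of the scheme described in the abstract, the sketch below rejects generated samples whose nearest-neighbor distance to a set of reference training samples falls below a threshold, and updates the generator only on the accepted samples. This is a minimal PyTorch sketch under stated assumptions, not the authors' implementation (see the repository above for that); the pixel-space L2 distance, the `rejection_mask` and `generator_step` names, and the fixed `threshold` are illustrative choices.

```python
# Minimal sketch of memorization rejection during GAN training (PyTorch).
# Hypothetical helper names; the paper's actual distance measure, feature
# space, and rejection threshold may differ from this illustration.
import torch
import torch.nn.functional as F

def rejection_mask(fake, train_ref, threshold):
    """Mark generated samples that are NOT near-duplicates of training samples.

    fake:      (B, D) generated samples flattened to vectors
    train_ref: (N, D) reference training samples (or their features)
    threshold: generated samples closer than this L2 distance to any
               reference sample are rejected
    """
    dists = torch.cdist(fake, train_ref)      # (B, N) pairwise L2 distances
    nearest = dists.min(dim=1).values         # distance to nearest training sample
    return nearest > threshold                # True = keep, False = reject

def generator_step(generator, discriminator, opt_g, z, train_ref, threshold):
    """One generator update that skips rejected (memorized-looking) samples."""
    fake = generator(z)
    keep = rejection_mask(fake.flatten(1), train_ref.flatten(1), threshold)
    if not keep.any():
        return None                           # everything rejected; skip this update
    logits = discriminator(fake[keep])
    loss = F.softplus(-logits).mean()         # non-saturating GAN loss on kept samples
    opt_g.zero_grad()
    loss.backward()
    opt_g.step()
    return loss.item()
```

In practice the near-duplicate test could equally be run in a perceptual feature space (for example, embeddings from a pretrained encoder) rather than in pixel space, with the threshold tuned to trade off memorization reduction against generation quality.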
Related papers
- The Unreasonable Ineffectiveness of Nucleus Sampling on Mitigating Text Memorization [15.348047288817478]
We analyze the text memorization behavior of large language models (LLMs) when subjected to nucleus sampling (a reference sketch of top-p sampling follows this entry).
An increase of the nucleus size reduces memorization only modestly.
Even when models do not engage in "hard" memorization, they may still display "soft" memorization.
arXiv Detail & Related papers (2024-08-29T08:30:33Z)
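For reference, the nucleus (top-p) sampling studied in the entry above keeps the smallest set of most-probable tokens whose cumulative probability reaches p and samples from that renormalized set; the "nucleus size" corresponds to this p. The sketch below is a generic illustration under that definition, not the paper's analysis code.

```python
# Minimal sketch of nucleus (top-p) sampling from a vector of token logits.
# Generic illustration; not taken from the paper discussed above.
import torch

def nucleus_sample(logits, p=0.9):
    """Sample one token id from the top-p nucleus of the distribution."""
    probs = torch.softmax(logits, dim=-1)
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    # A token stays in the nucleus if the probability mass before it is < p,
    # so the single most likely token is always kept.
    keep = (cumulative - sorted_probs) < p
    nucleus = sorted_probs[keep] / sorted_probs[keep].sum()
    choice = torch.multinomial(nucleus, 1)
    return sorted_idx[keep][choice].item()
```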
- Detection of Under-represented Samples Using Dynamic Batch Training for Brain Tumor Segmentation from MR Images [0.8437187555622164]
Manually segmenting brain tumors in magnetic resonance (MR) images is difficult, time-consuming, and prone to human error.
These challenges can be resolved by developing automatic brain tumor segmentation methods from MR images.
Various deep-learning models based on the U-Net have been proposed for the task.
These deep-learning models are trained on a dataset of tumor images and then used for segmenting the masks.
arXiv Detail & Related papers (2024-08-21T21:51:47Z)
- How Low Can You Go? Surfacing Prototypical In-Distribution Samples for Unsupervised Anomaly Detection [48.30283806131551]
We show that UAD with extremely few training samples can already match -- and in some cases even surpass -- the performance of training with the whole training dataset.
We propose an unsupervised method to reliably identify prototypical samples to further boost UAD performance.
arXiv Detail & Related papers (2023-12-06T15:30:47Z)
- Forgetting Data from Pre-trained GANs [28.326418377665345]
We investigate how to post-edit a model after training so that it forgets certain kinds of samples.
We provide three different algorithms for GANs that differ on how the samples to be forgotten are described.
Our algorithms are capable of forgetting data while retaining high generation quality at a fraction of the cost of full re-training.
arXiv Detail & Related papers (2022-06-29T03:46:16Z)
- ReSmooth: Detecting and Utilizing OOD Samples when Training with Data Augmentation [57.38418881020046]
Recent data augmentation (DA) techniques generally aim for high diversity in augmented training samples.
An augmentation strategy that has a high diversity usually introduces out-of-distribution (OOD) augmented samples.
We propose ReSmooth, a framework that firstly detects OOD samples in augmented samples and then leverages them.
arXiv Detail & Related papers (2022-05-25T09:29:27Z)
- ReMix: Towards Image-to-Image Translation with Limited Data [154.71724970593036]
We propose a data augmentation method (ReMix) to tackle the limited-data issue.
We interpolate training samples at the feature level and propose a novel content loss based on the perceptual relations among samples (see the interpolation sketch after this entry).
The proposed approach effectively reduces the ambiguity of generation and renders content-preserving results.
arXiv Detail & Related papers (2021-03-31T06:24:10Z)
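To make the "interpolation at the feature level" mentioned above concrete, here is a generic mixup-style sketch. The Beta-distributed mixing weight and the helper name are assumptions for illustration; the paper's actual interpolation scheme and content loss are not reproduced here.

```python
# Generic sketch of feature-level interpolation between two encoded samples.
# Illustrative only; not the ReMix implementation.
import torch

def interpolate_features(feat_a, feat_b, alpha=0.2):
    """Blend two batches of encoder features with a Beta(alpha, alpha) weight."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    mixed = lam * feat_a + (1.0 - lam) * feat_b
    return mixed, lam        # lam can also weight the corresponding loss terms
```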
- One for More: Selecting Generalizable Samples for Generalizable ReID Model [92.40951770273972]
This paper proposes a one-for-more training objective that takes the generalization ability of selected samples as a loss function.
Our proposed one-for-more based sampler can be seamlessly integrated into the ReID training framework.
arXiv Detail & Related papers (2020-12-10T06:37:09Z)
- Instance Selection for GANs [25.196177369030146]
Recent advances in Generative Adversarial Networks (GANs) have led to their widespread adoption for generating high-quality synthetic imagery.
GANs often produce unrealistic samples which fall outside of the data manifold.
We propose a novel approach to improve sample quality: altering the training dataset via instance selection before model training has taken place (see the selection sketch after this entry).
arXiv Detail & Related papers (2020-07-30T06:33:51Z)
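As a rough illustration of selecting instances before training, the sketch below keeps only the fraction of embedded training samples with the highest density under a Gaussian fit to the whole embedding set. The embedding source, the Gaussian density model, and the retention ratio are assumptions for illustration, not necessarily the paper's exact choices.

```python
# Minimal sketch of dataset instance selection prior to GAN training: retain
# only training points that lie in dense regions of an embedding space.
# The density model and keep_ratio here are illustrative assumptions.
import torch

def select_instances(embeddings, keep_ratio=0.5):
    """Return indices of the keep_ratio fraction of samples with the highest
    density under a full-covariance Gaussian fit to all embeddings."""
    n, d = embeddings.shape
    mean = embeddings.mean(dim=0)
    centered = embeddings - mean
    cov = centered.T @ centered / (n - 1)
    cov = cov + 1e-4 * torch.eye(d)          # regularize for numerical stability
    density = torch.distributions.MultivariateNormal(mean, covariance_matrix=cov)
    scores = density.log_prob(embeddings)    # (n,) log-densities
    k = max(1, int(keep_ratio * n))
    return scores.topk(k).indices            # indices of retained training samples
```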
- Automatic Recall Machines: Internal Replay, Continual Learning and the Brain [104.38824285741248]
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
We present a method where these auxiliary samples are generated on the fly, given only the model that is being trained for the assessed objective.
Instead, the implicit memory of learned samples within the assessed model itself is exploited.
arXiv Detail & Related papers (2020-06-22T15:07:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.