Reducing Training Sample Memorization in GANs by Training with
Memorization Rejection
- URL: http://arxiv.org/abs/2210.12231v1
- Date: Fri, 21 Oct 2022 20:17:50 GMT
- Title: Reducing Training Sample Memorization in GANs by Training with
Memorization Rejection
- Authors: Andrew Bai, Cho-Jui Hsieh, Wendy Kan, Hsuan-Tien Lin
- Abstract summary: We propose memorization rejection, a training scheme that rejects generated samples that are near-duplicates of training samples during training.
Our scheme is simple, generic and can be directly applied to any GAN architecture.
- Score: 80.0916819303573
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative adversarial networks (GANs) continue to be a popular research
direction due to their high generation quality. It is observed that many
state-of-the-art GANs generate samples that are more similar to the training
set than to a holdout testing set from the same distribution, hinting that some
training samples are implicitly memorized in these models. This memorization
behavior is unfavorable in many applications that demand the generated samples
to be sufficiently distinct from known samples. Nevertheless, it is unclear
whether it is possible to reduce memorization without compromising the
generation quality. In this paper, we propose memorization rejection, a
training scheme that rejects generated samples that are near-duplicates of
training samples during training. Our scheme is simple, generic and can be
directly applied to any GAN architecture. Experiments on multiple datasets and
GAN models validate that memorization rejection effectively reduces training
sample memorization, and in many cases does not sacrifice the generation
quality. Code to reproduce the experimental results can be found at
https://github.com/jybai/MRGAN.
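As a rough illustration of the scheme described in the abstract, the sketch below rejects generated samples whose nearest-neighbor distance to a set of reference training samples falls below a threshold, and updates the generator only on the accepted samples. This is a minimal PyTorch sketch under stated assumptions, not the authors' implementation (see the repository above for that); the pixel-space L2 distance, the `rejection_mask` and `generator_step` names, and the fixed `threshold` are illustrative choices.

```python
# Minimal sketch of memorization rejection during GAN training (PyTorch).
# Hypothetical helper names; the paper's actual distance measure, feature
# space, and rejection threshold may differ from this illustration.
import torch
import torch.nn.functional as F

def rejection_mask(fake, train_ref, threshold):
    """Mark generated samples that are NOT near-duplicates of training samples.

    fake:      (B, D) generated samples flattened to vectors
    train_ref: (N, D) reference training samples (or their features)
    threshold: generated samples closer than this L2 distance to any
               reference sample are rejected
    """
    dists = torch.cdist(fake, train_ref)      # (B, N) pairwise L2 distances
    nearest = dists.min(dim=1).values         # distance to nearest training sample
    return nearest > threshold                # True = keep, False = reject

def generator_step(generator, discriminator, opt_g, z, train_ref, threshold):
    """One generator update that skips rejected (memorized-looking) samples."""
    fake = generator(z)
    keep = rejection_mask(fake.flatten(1), train_ref.flatten(1), threshold)
    if not keep.any():
        return None                           # everything rejected; skip this update
    logits = discriminator(fake[keep])
    loss = F.softplus(-logits).mean()         # non-saturating GAN loss on kept samples
    opt_g.zero_grad()
    loss.backward()
    opt_g.step()
    return loss.item()
```

In practice the near-duplicate test could equally be run in a perceptual feature space (for example, embeddings from a pretrained encoder) rather than in pixel space, with the threshold tuned to trade off memorization reduction against generation quality.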
Related papers
- The Unreasonable Ineffectiveness of Nucleus Sampling on Mitigating Text Memorization [15.348047288817478]
We analyze the text memorization behavior of large language models (LLMs) when subjected to nucleus sampling (a reference sketch of top-p sampling follows this entry).
An increase of the nucleus size reduces memorization only modestly.
Even when models do not engage in "hard" memorization, they may still display "soft" memorization.
arXiv Detail & Related papers (2024-08-29T08:30:33Z)
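For reference, the nucleus (top-p) sampling studied in the entry above keeps the smallest set of most-probable tokens whose cumulative probability reaches p and samples from that renormalized set; the "nucleus size" corresponds to this p. The sketch below is a generic illustration under that definition, not the paper's analysis code.

```python
# Minimal sketch of nucleus (top-p) sampling from a vector of token logits.
# Generic illustration; not taken from the paper discussed above.
import torch

def nucleus_sample(logits, p=0.9):
    """Sample one token id from the top-p nucleus of the distribution."""
    probs = torch.softmax(logits, dim=-1)
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    # A token stays in the nucleus if the probability mass before it is < p,
    # so the single most likely token is always kept.
    keep = (cumulative - sorted_probs) < p
    nucleus = sorted_probs[keep] / sorted_probs[keep].sum()
    choice = torch.multinomial(nucleus, 1)
    return sorted_idx[keep][choice].item()
```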
- Detection of Under-represented Samples Using Dynamic Batch Training for Brain Tumor Segmentation from MR Images [0.8437187555622164]
Manually segmenting brain tumors in magnetic resonance (MR) images is difficult, time-consuming, and prone to human error.
These challenges can be resolved by developing automatic brain tumor segmentation methods from MR images.
Various deep-learning models based on the U-Net have been proposed for the task.
These deep-learning models are trained on a dataset of tumor images and then used for segmenting the masks.
arXiv Detail & Related papers (2024-08-21T21:51:47Z)
- How Low Can You Go? Surfacing Prototypical In-Distribution Samples for Unsupervised Anomaly Detection [48.30283806131551]
We show that UAD with extremely few training samples can already match -- and in some cases even surpass -- the performance of training with the whole training dataset.
We propose an unsupervised method to reliably identify prototypical samples to further boost UAD performance.
arXiv Detail & Related papers (2023-12-06T15:30:47Z)
- Forgetting Data from Pre-trained GANs [28.326418377665345]
We investigate how to post-edit a model after training so that it forgets certain kinds of samples.
We provide three different algorithms for GANs that differ on how the samples to be forgotten are described.
Our algorithms are capable of forgetting data while retaining high generation quality at a fraction of the cost of full re-training.
arXiv Detail & Related papers (2022-06-29T03:46:16Z)
- ReSmooth: Detecting and Utilizing OOD Samples when Training with Data Augmentation [57.38418881020046]
Recent data augmentation (DA) techniques generally aim for high diversity in augmented training samples.
An augmentation strategy that has a high diversity usually introduces out-of-distribution (OOD) augmented samples.
We propose ReSmooth, a framework that firstly detects OOD samples in augmented samples and then leverages them.
arXiv Detail & Related papers (2022-05-25T09:29:27Z)
- ReMix: Towards Image-to-Image Translation with Limited Data [154.71724970593036]
We propose a data augmentation method (ReMix) to tackle the limited-data issue.
We interpolate training samples at the feature level and propose a novel content loss based on the perceptual relations among samples (see the interpolation sketch after this entry).
The proposed approach effectively reduces the ambiguity of generation and renders content-preserving results.
arXiv Detail & Related papers (2021-03-31T06:24:10Z)
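To make the "interpolation at the feature level" mentioned above concrete, here is a generic mixup-style sketch. The Beta-distributed mixing weight and the helper name are assumptions for illustration; the paper's actual interpolation scheme and content loss are not reproduced here.

```python
# Generic sketch of feature-level interpolation between two encoded samples.
# Illustrative only; not the ReMix implementation.
import torch

def interpolate_features(feat_a, feat_b, alpha=0.2):
    """Blend two batches of encoder features with a Beta(alpha, alpha) weight."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    mixed = lam * feat_a + (1.0 - lam) * feat_b
    return mixed, lam        # lam can also weight the corresponding loss terms
```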
- One for More: Selecting Generalizable Samples for Generalizable ReID Model [92.40951770273972]
This paper proposes a one-for-more training objective that takes the generalization ability of selected samples as a loss function.
Our proposed one-for-more based sampler can be seamlessly integrated into the ReID training framework.
arXiv Detail & Related papers (2020-12-10T06:37:09Z)
- Instance Selection for GANs [25.196177369030146]
Recent advances in Generative Adversarial Networks (GANs) have led to their widespread adoption for generating high-quality synthetic imagery.
GANs often produce unrealistic samples which fall outside of the data manifold.
We propose a novel approach to improve sample quality: altering the training dataset via instance selection before model training has taken place (see the selection sketch after this entry).
arXiv Detail & Related papers (2020-07-30T06:33:51Z)
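As a rough illustration of selecting instances before training, the sketch below keeps only the fraction of embedded training samples with the highest density under a Gaussian fit to the whole embedding set. The embedding source, the Gaussian density model, and the retention ratio are assumptions for illustration, not necessarily the paper's exact choices.

```python
# Minimal sketch of dataset instance selection prior to GAN training: retain
# only training points that lie in dense regions of an embedding space.
# The density model and keep_ratio here are illustrative assumptions.
import torch

def select_instances(embeddings, keep_ratio=0.5):
    """Return indices of the keep_ratio fraction of samples with the highest
    density under a full-covariance Gaussian fit to all embeddings."""
    n, d = embeddings.shape
    mean = embeddings.mean(dim=0)
    centered = embeddings - mean
    cov = centered.T @ centered / (n - 1)
    cov = cov + 1e-4 * torch.eye(d)          # regularize for numerical stability
    density = torch.distributions.MultivariateNormal(mean, covariance_matrix=cov)
    scores = density.log_prob(embeddings)    # (n,) log-densities
    k = max(1, int(keep_ratio * n))
    return scores.topk(k).indices            # indices of retained training samples
```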
- Automatic Recall Machines: Internal Replay, Continual Learning and the Brain [104.38824285741248]
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
We present a method where these auxiliary samples are generated on the fly, given only the model that is being trained for the assessed objective.
Instead, the implicit memory of learned samples within the assessed model itself is exploited.
arXiv Detail & Related papers (2020-06-22T15:07:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.