Generation from Noisy Examples
- URL: http://arxiv.org/abs/2501.04179v1
- Date: Tue, 07 Jan 2025 23:16:14 GMT
- Title: Generation from Noisy Examples
- Authors: Ananth Raman, Vinod Raman
- Abstract summary: We extend results on the learning-theoretic foundations of generation to account for noisy example streams. For finite and countable classes, we show that generatability is largely unaffected by the presence of a finite number of noisy examples.
- Score: 2.3020018305241337
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We continue to study the learning-theoretic foundations of generation by extending the results of Kleinberg and Mullainathan [2024] and Li et al. [2024] to account for noisy example streams. In the noiseless setting of Kleinberg and Mullainathan [2024] and Li et al. [2024], an adversary picks a hypothesis from a binary hypothesis class and provides a generator with a sequence of its positive examples. The goal of the generator is to eventually output new, unseen positive examples. In the noisy setting, an adversary still picks a hypothesis and a sequence of its positive examples, but before presenting the stream to the generator, the adversary inserts a finite number of negative examples. The generator, unaware of which examples are noisy, must still eventually output new, unseen positive examples. In this paper, we provide necessary and sufficient conditions for when a binary hypothesis class can be noisily generatable. We provide such conditions with respect to various constraints on the number of distinct examples that need to be seen before perfect generation of positive examples. Interestingly, for finite and countable classes we show that generatability is largely unaffected by the presence of a finite number of noisy examples.
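To make the interaction concrete, here is a minimal toy sketch in Python of the noisy generation game for a finite class. The selection rule used here, keeping the hypothesis that disagrees with the fewest stream elements, is an illustrative heuristic over made-up data, not the paper's algorithm or its characterization of noisy generatability.

```python
# Toy sketch of the noisy generation game. The adversary streams positive
# examples of a target hypothesis (a set), inserting finitely many negative
# (noisy) examples; the generator never learns which examples are noisy.

hypothesis_class = [
    {0, 1, 2, 3, 4, 5, 6, 7},   # target chosen by the adversary
    {0, 2, 4, 6, 8, 10},
    {5, 6, 7, 8, 9},
]

def generate(seen, classes):
    """Output a new, unseen element of the hypothesis that disagrees
    with the fewest seen examples (ties broken by list order)."""
    best = min(classes, key=lambda h: sum(x not in h for x in seen))
    fresh = sorted(best - seen)
    return fresh[0] if fresh else None

# Positives of the target, with two noisy negatives (8 and 19) inserted.
stream = [0, 1, 8, 2, 19, 3, 4]
seen = set()
for x in stream:
    seen.add(x)
    print(f"saw {x}, generated {generate(seen, hypothesis_class)}")
```

Because only finitely many stream elements are noisy, the target's disagreement count stays bounded in this toy run, so the heuristic keeps producing unseen positives; the paper's results make precise when and how such noise tolerance is achievable in general.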
Related papers
- Representative Language Generation [4.601683217376771]
"representative generation" is extended to address diversity and bias concerns in generative models.<n>We demonstrate feasibility for countably infinite hypothesis classes and collections of groups under certain conditions.<n>Our findings provide a rigorous foundation for developing more diverse and representative generative models.
arXiv Detail & Related papers (2025-05-27T23:02:54Z)
- Wide Two-Layer Networks can Learn from Adversarial Perturbations [27.368408524000778]
We theoretically explain the counterintuitive success of perturbation learning.
We prove that adversarial perturbations contain sufficient class-specific features for networks to generalize from them.
arXiv Detail & Related papers (2024-10-31T06:55:57Z)
- Controllable Game Level Generation: Assessing the Effect of Negative Examples in GAN Models [3.2228025627337864]
Generative Adversarial Networks (GANs) are unsupervised models designed to learn and replicate a target distribution.
Conditional Generative Adversarial Networks (CGANs) extend vanilla GANs by conditioning both the generator and discriminator on some additional information (a generic sketch of such conditioning follows below).
Rumi-GANs leverage negative examples to enhance the generator's ability to learn positive examples.
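As a generic illustration of the conditioning described above, and not this paper's model, a CGAN-style generator typically receives the condition alongside the noise vector, e.g., by concatenating the noise with a label embedding; the discriminator is conditioned analogously.

```python
import torch

# Generic CGAN-style conditioning sketch: concatenate a noise vector with
# an embedding of the condition (here, a class label) before generating.
noise_dim, n_classes, label_dim = 64, 10, 16
label_embed = torch.nn.Embedding(n_classes, label_dim)
generator = torch.nn.Sequential(
    torch.nn.Linear(noise_dim + label_dim, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 28 * 28),  # e.g., a flattened toy image
)

z = torch.randn(4, noise_dim)            # batch of noise vectors
y = torch.randint(0, n_classes, (4,))    # batch of condition labels
fake = generator(torch.cat([z, label_embed(y)], dim=1))
```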
arXiv Detail & Related papers (2024-10-30T15:18:26Z)
- Generation through the lens of learning theory [18.355039522639565]
We study generation through the lens of statistical learning theory. We introduce two notions, which we call "uniform" and "non-uniform" generation, and provide a characterization of which hypothesis classes are uniformly and non-uniformly generatable.
arXiv Detail & Related papers (2024-10-17T16:14:49Z)
- GAPX: Generalized Autoregressive Paraphrase-Identification X [24.331570697458954]
A major source of this performance drop is bias introduced by negative examples.
We introduce a perplexity-based out-of-distribution metric that we show can effectively and automatically determine the appropriate weighting during inference.
arXiv Detail & Related papers (2022-10-05T01:23:52Z)
- Centrality and Consistency: Two-Stage Clean Samples Identification for Learning with Instance-Dependent Noisy Labels [87.48541631675889]
We propose a two-stage clean samples identification method.
First, we employ a class-level feature clustering procedure for the early identification of clean samples.
Second, for the remaining clean samples that are close to the ground truth class boundary, we propose a novel consistency-based classification method.
arXiv Detail & Related papers (2022-07-29T04:54:57Z)
- Instance-wise Hard Negative Example Generation for Contrastive Learning in Unpaired Image-to-Image Translation [102.99799162482283]
We present instance-wise hard Negative Example Generation for Contrastive learning in Unpaired image-to-image Translation (NEGCUT).
Specifically, we train a generator to produce negative examples online. The generator is novel from two perspectives: 1) it is instance-wise which means that the generated examples are based on the input image, and 2) it can generate hard negative examples since it is trained with an adversarial loss.
arXiv Detail & Related papers (2021-08-10T09:44:59Z)
- Investigating the Role of Negatives in Contrastive Representation Learning [59.30700308648194]
Noise contrastive learning is a popular technique for unsupervised representation learning.
We focus on disambiguating the role of one of these parameters: the number of negative examples.
We find that the results broadly agree with our theory, while our vision experiments are murkier, with performance sometimes even insensitive to the number of negatives.
arXiv Detail & Related papers (2021-06-18T06:44:16Z)
- Contrastive Learning with Adversarial Perturbations for Conditional Text Generation [49.055659008469284]
We propose a principled method to generate positive and negative samples for contrastive learning of seq2seq models.
Specifically, we generate negative examples by adding small perturbations to the input sequence to minimize its conditional likelihood.
We empirically show that our proposed method significantly improves the generalization of seq2seq models on three text generation tasks (a toy sketch of the perturbation step follows).
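As a toy illustration of the perturbation step referenced above, gradient ascent on the target's conditional negative log-likelihood yields an embedding-space "hard negative"; the tiny embedding-plus-linear "decoder" below is a hypothetical stand-in, not the paper's seq2seq model or training loop.

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-in for a seq2seq decoder: token embeddings -> logits.
torch.manual_seed(0)
vocab, dim = 10, 8
embed = torch.nn.Embedding(vocab, dim)
head = torch.nn.Linear(dim, vocab)

tgt = torch.tensor([3, 1, 4])                   # toy target sequence
emb = embed(tgt).detach().requires_grad_(True)  # perturb in embedding space
nll = F.cross_entropy(head(emb), tgt)           # conditional NLL of target
nll.backward()

# A small step that increases the NLL, i.e., lowers the conditional
# likelihood of the target: a "hard negative" for contrastive training.
negative_emb = emb + 1e-2 * emb.grad.sign()
```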
arXiv Detail & Related papers (2020-12-14T06:20:27Z)
- Tracking disease outbreaks from sparse data with Bayesian inference [55.82986443159948]
The COVID-19 pandemic provides new motivation for estimating the empirical rate of transmission during an outbreak.
Standard methods struggle to accommodate the partial observability and sparse data common at finer scales.
We propose a Bayesian framework which accommodates partial observability in a principled manner.
arXiv Detail & Related papers (2020-09-12T20:37:33Z)
- Counterexamples to the Low-Degree Conjecture [80.3668228845075]
A conjecture of Hopkins posits that for certain high-dimensional hypothesis testing problems, no polynomial-time algorithm can outperform so-called "simple statistics".
This conjecture formalizes the beliefs surrounding a line of recent work that seeks to understand statistical-versus-computational tradeoffs.
arXiv Detail & Related papers (2020-04-17T21:08:11Z)
- Generating Natural Adversarial Hyperspectral examples with a modified Wasserstein GAN [0.0]
We present a new method that is able to generate natural adversarial examples from the true data, following the second paradigm.
We provide a proof of concept of our method by generating adversarial hyperspectral signatures on a remote sensing dataset.
arXiv Detail & Related papers (2020-01-27T07:32:46Z)