Idempotent Generative Network
- URL: http://arxiv.org/abs/2311.01462v1
- Date: Thu, 2 Nov 2023 17:59:55 GMT
- Title: Idempotent Generative Network
- Authors: Assaf Shocher, Amil Dravid, Yossi Gandelsman, Inbar Mosseri, Michael
Rubinstein, Alexei A. Efros
- Abstract summary: We propose a new approach for generative modeling based on training a neural network to be idempotent.
An idempotent operator is one that can be applied sequentially without changing the result beyond the initial application.
We find that by processing inputs from both target and source distributions, the model adeptly projects corrupted or modified data back to the target manifold.
- Score: 61.78905138698094
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a new approach for generative modeling based on training a neural
network to be idempotent. An idempotent operator is one that can be applied
sequentially without changing the result beyond the initial application, namely
$f(f(z))=f(z)$. The proposed model $f$ is trained to map a source distribution
(e.g., Gaussian noise) to a target distribution (e.g., realistic images) using
the following objectives: (1) Instances from the target distribution should map
to themselves, namely $f(x)=x$. We define the target manifold as the set of all
instances that $f$ maps to themselves. (2) Instances from the source
distribution should map onto the defined target manifold. This is achieved by
optimizing the idempotence term $f(f(z))=f(z)$, which encourages the range of
$f(z)$ to lie on the target manifold. Under ideal assumptions, such a process
provably converges to the target distribution. This strategy yields a model
that generates an output in a single step, maintains a consistent latent
space, and also allows sequential applications for refinement.
Additionally, we find that by processing inputs from both target and source
distributions, the model adeptly projects corrupted or modified data back to
the target manifold. This work is a first step towards a "global projector"
that enables projecting any input into a target data distribution.
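To make the two training objectives concrete, here is a minimal PyTorch sketch of one optimization step, written only from the abstract above. The squared-error form of the losses, the frozen copy of $f$ for the outer application, and the weight `lambda_idem` are assumptions of this sketch, not necessarily the paper's exact recipe (the full method also has to rule out degenerate solutions, e.g. $f$ collapsing its range to a few points).

```python
import copy
import torch

def ign_training_step(f, x_real, opt, lambda_idem=1.0):
    # One IGN-style training step (sketch; assumptions noted above).
    z = torch.randn_like(x_real)  # source distribution: Gaussian noise

    # (1) Instances from the target distribution map to themselves: f(x) = x.
    loss_rec = (f(x_real) - x_real).pow(2).mean()

    # (2) Idempotence: f(f(z)) = f(z). A parameter-frozen copy is used for the
    # outer application so the gradient pushes f(z) toward the target manifold
    # (gradients still flow through the inner application's output).
    f_frozen = copy.deepcopy(f).requires_grad_(False)
    fz = f(z)
    loss_idem = (f_frozen(fz) - fz).pow(2).mean()

    loss = loss_rec + lambda_idem * loss_idem
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.detach()
```

After training, generation is a single forward pass, `x = f(z)`, and further applications `f(x)` act as refinement, since points already on the manifold are (approximately) fixed points.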
Related papers
- IT$^3$: Idempotent Test-Time Training [95.78053599609044]
This paper introduces Idempotent Test-Time Training (IT$^3$), a novel approach to addressing the challenge of distribution shift.
IT$^3$ is based on the universal property of idempotence.
We demonstrate the versatility of our approach across various tasks, including corrupted image classification.
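The following is a rough sketch of how an idempotence objective can drive test-time adaptation, based only on the summary above; the interface $g(x, y)$ (a model that takes the input together with a candidate output) and all hyperparameters are hypothetical, and the actual IT$^3$ procedure should be taken from the paper.

```python
import torch

def idempotent_test_time_step(g, x, y_init, steps=5, lr=1e-4):
    # Hypothetical idempotence-driven adaptation: nudge g's parameters so a
    # second application leaves its own output unchanged, g(x, g(x, y)) = g(x, y).
    opt = torch.optim.SGD(g.parameters(), lr=lr)
    for _ in range(steps):
        y1 = g(x, y_init).detach()       # first application, held fixed
        y2 = g(x, y1)                    # second application on g's own output
        loss = (y2 - y1).pow(2).mean()   # idempotence residual
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        return g(x, y_init)              # adapted prediction
```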
arXiv Detail & Related papers (2024-10-05T15:39:51Z)
- A Sharp Convergence Theory for The Probability Flow ODEs of Diffusion Models [45.60426164657739]
We develop non-asymptotic convergence theory for a diffusion-based sampler.
We prove that $d/\varepsilon$ iterations are sufficient to approximate the target distribution to within $\varepsilon$ total-variation distance.
Our results also characterize how $\ell_2$ score estimation errors affect the quality of the data generation processes.
arXiv Detail & Related papers (2024-08-05T09:02:24Z)
- Concentration Inequalities for $(f,\Gamma)$-GANs [5.022028859839544]
Generative adversarial networks (GANs) are unsupervised learning methods for training a generator distribution to produce samples that approximate those drawn from a target distribution.
Recent works have proven the statistical consistency of GANs based on integral probability metrics (IPMs), e.g., WGAN, which is based on the 1-Wasserstein metric.
A much larger class of GANs, which allow for the use of nonlinear objective functionals, can be constructed using $(f,\Gamma)$-divergences.
arXiv Detail & Related papers (2024-06-24T17:42:03Z)
- Delta-AI: Local objectives for amortized inference in sparse graphical models [64.5938437823851]
We present a new algorithm for amortized inference in sparse probabilistic graphical models (PGMs).
Our approach is based on the observation that when the sampling of variables in a PGM is seen as a sequence of actions taken by an agent, sparsity of the PGM enables local credit assignment in the agent's policy learning objective.
We illustrate $\Delta$-AI's effectiveness for sampling from synthetic PGMs and training latent variable models with sparse factor structure.
arXiv Detail & Related papers (2023-10-03T20:37:03Z)
- Normalizing flow sampling with Langevin dynamics in the latent space [12.91637880428221]
Normalizing flows (NFs) use a continuous generator to map a simple latent distribution (e.g., Gaussian) to an empirical target distribution associated with a training data set.
Since standard NFs implement differentiable maps, they can exhibit pathological behavior when targeting complex distributions.
This paper proposes a new Markov chain Monte Carlo algorithm to sample from the target distribution in the latent domain before transporting it back to the target domain.
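A minimal sketch of the general recipe follows, under assumptions of this sketch about the interface: `latent_log_prob(z)` is presumed to return the log-density of the target expressed in latent coordinates (e.g., the data density pulled back through the flow), and the resulting samples would then be pushed through the flow's generator.

```python
import torch

def latent_langevin(latent_log_prob, z0, n_steps=200, step=1e-2):
    # Unadjusted Langevin dynamics in the flow's latent space (sketch):
    #   z <- z + (step/2) * grad log p(z) + sqrt(step) * noise
    z = z0.clone().requires_grad_(True)
    for _ in range(n_steps):
        logp = latent_log_prob(z).sum()
        (grad,) = torch.autograd.grad(logp, z)
        with torch.no_grad():
            z += 0.5 * step * grad + step ** 0.5 * torch.randn_like(z)
    return z.detach()  # transport back to data space via the flow generator
```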
arXiv Detail & Related papers (2023-05-20T09:31:35Z)
- Multi-Task Imitation Learning for Linear Dynamical Systems [50.124394757116605]
We study representation learning for efficient imitation learning over linear systems.
We find that the imitation gap over trajectories generated by the learned target policy is bounded by $\tilde{O}\left(\frac{k n_x}{H N_{\mathrm{shared}}} + \frac{k n_u}{N_{\mathrm{target}}}\right)$.
arXiv Detail & Related papers (2022-12-01T00:14:35Z)
- Redistributor: Transforming Empirical Data Distributions [1.4936946857731088]
Redistributor forces a collection of scalar samples to follow a desired distribution.
It produces a consistent estimator of the transformation $R$, which satisfies $R(S)=T$ in distribution.
The package is implemented in Python and is optimized to efficiently handle large datasets.
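The underlying construction is the classic CDF-matching transform $R = F_T^{-1} \circ F_S$. Below is a minimal NumPy/SciPy sketch of that idea; the actual Redistributor package builds a smoothed, consistent estimator and is engineered for large datasets, so treat this as an illustration only.

```python
import numpy as np
from scipy import stats

def redistribute(samples, target=stats.norm):
    # Push scalar samples toward a target law via R = F_T^{-1} o F_S (sketch).
    ranks = stats.rankdata(samples)      # empirical-CDF ranks, 1..n
    u = (ranks - 0.5) / len(samples)     # map into (0, 1), avoiding endpoints
    return target.ppf(u)                 # apply the target's inverse CDF

# e.g., exponential samples become approximately standard normal:
y = redistribute(np.random.exponential(size=10_000))
```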
arXiv Detail & Related papers (2022-10-25T17:59:03Z)
- KL Guided Domain Adaptation [88.19298405363452]
Domain adaptation is an important problem and often needed for real-world applications.
A common approach in the domain adaptation literature is to learn a representation of the input that has the same distribution over the source and target domains.
We show that with a probabilistic representation network, the KL term can be estimated efficiently via minibatch samples.
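A sketch of what such a minibatch estimate might look like, under assumptions that are not from the paper: the representation network outputs Gaussian posteriors, and each domain's marginal over representations is approximated by a uniform mixture of the batch's posteriors.

```python
import math
import torch
from torch.distributions import Normal

def minibatch_kl(mu_t, std_t, mu_s, std_s):
    # Monte Carlo estimate of KL(p_t(z) || p_s(z)) from one minibatch (sketch).
    # mu_*, std_*: (batch, dim) Gaussian posterior parameters per example.
    z = Normal(mu_t, std_t).rsample()  # one z per target example, shape (B, d)

    def log_mixture(z, mu, std):
        # log of (1/B) * sum_j N(z | mu_j, std_j), evaluated for every z
        lp = Normal(mu[None], std[None]).log_prob(z[:, None]).sum(-1)  # (B, B)
        return torch.logsumexp(lp, dim=1) - math.log(mu.shape[0])

    return (log_mixture(z, mu_t, std_t) - log_mixture(z, mu_s, std_s)).mean()
```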
arXiv Detail & Related papers (2021-06-14T22:24:23Z)
- On Generating Transferable Targeted Perturbations [102.3506210331038]
We propose a new generative approach for highly transferable targeted perturbations.
Our approach matches the perturbed-image distribution with that of the target class, leading to high targeted transferability rates.
arXiv Detail & Related papers (2021-03-26T17:55:28Z)
- Convergence and Sample Complexity of SGD in GANs [15.25030172685628]
We provide convergence guarantees on training Generative Adversarial Networks (GANs) via SGD.
We consider learning a target distribution modeled by a 1-layer Generator network with a non-linear activation function.
Our results apply to a broad class of non-linear activation functions $\phi$, including ReLUs, and are enabled by a connection with truncated statistics.
arXiv Detail & Related papers (2020-12-01T18:50:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.