GAN-Supervised Dense Visual Alignment
- URL: http://arxiv.org/abs/2112.05143v1
- Date: Thu, 9 Dec 2021 18:59:58 GMT
- Title: GAN-Supervised Dense Visual Alignment
- Authors: William Peebles, Jun-Yan Zhu, Richard Zhang, Antonio Torralba, Alexei
Efros, Eli Shechtman
- Abstract summary: We propose GAN-Supervised Learning, a framework for learning discriminative models and their GAN-generated training data jointly end-to-end.
Inspired by the classic Congealing method, our GANgealing algorithm trains a Spatial Transformer to map random samples from a GAN trained on unaligned data to a common, jointly-learned target mode.
- Score: 95.37027391102684
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose GAN-Supervised Learning, a framework for learning discriminative
models and their GAN-generated training data jointly end-to-end. We apply our
framework to the dense visual alignment problem. Inspired by the classic
Congealing method, our GANgealing algorithm trains a Spatial Transformer to map
random samples from a GAN trained on unaligned data to a common,
jointly-learned target mode. We show results on eight datasets, all of which
demonstrate our method successfully aligns complex data and discovers dense
correspondences. GANgealing significantly outperforms past self-supervised
correspondence algorithms and performs on-par with (and sometimes exceeds)
state-of-the-art supervised correspondence algorithms on several datasets --
without making use of any correspondence supervision or data augmentation and
despite being trained exclusively on GAN-generated data. For precise
correspondence, we improve upon state-of-the-art supervised methods by as much
as $3\times$. We show applications of our method for augmented reality, image
editing and automated pre-processing of image datasets for downstream GAN
training.
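The core idea — jointly learning per-sample transformations and the common target mode they align to — can be illustrated with a toy 1-D analogue of the classic Congealing procedure the abstract cites. This is a hedged sketch only: the "spatial transformer" is reduced to a per-sample circular shift, the GAN samples are replaced by synthetic shifted bumps, and none of the names or constants below come from the paper's actual implementation.

```python
import math
import random

def gaussian_bump(n, center, width=3.0):
    """A 1-D 'image': a single smooth peak at `center`."""
    return [math.exp(-((i - center) ** 2) / (2 * width ** 2)) for i in range(n)]

def shift(signal, k):
    """Circularly shift a signal by k samples (a 1-D stand-in for a spatial transform)."""
    k %= len(signal)
    return signal[-k:] + signal[:-k]

def sse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def congeal(samples, iters=5, max_shift=30):
    """Toy congealing: alternate between re-estimating the target mode
    (the mean of the aligned samples) and re-fitting each sample's shift."""
    shifts = [0] * len(samples)
    mode = None
    for _ in range(iters):
        aligned = [shift(s, k) for s, k in zip(samples, shifts)]
        # the jointly learned target mode
        mode = [sum(col) / len(col) for col in zip(*aligned)]
        for i, s in enumerate(samples):
            # update each sample's transform to best match the current mode
            shifts[i] = min(range(-max_shift, max_shift + 1),
                            key=lambda k: sse(shift(s, k), mode))
    return shifts, mode

random.seed(0)
n = 64
offsets = [random.randint(-15, 15) for _ in range(8)]       # unaligned data
samples = [gaussian_bump(n, n // 2 + off) for off in offsets]
shifts, mode = congeal(samples)
# after congealing, (learned shift + original offset) is the same constant
# for every sample, i.e. all peaks land at one common location
residual = [k + off for k, off in zip(shifts, offsets)]
```

GANgealing replaces the per-sample shift with a learned Spatial Transformer network and the synthetic bumps with GAN samples, but the alternating structure — fit transforms to a mode that is itself learned from the transformed data — is the same.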
Related papers
- Decentralized Federated Learning with Gradient Tracking over Time-Varying Directed Networks [42.92231921732718]
We propose a consensus-based algorithm called DSGTm-TV.
It incorporates gradient tracking and heavy-ball momentum to optimize a global objective function.
Under DSGTm-TV, agents update local model parameters and gradient estimates by exchanging information with neighboring agents.
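The interplay of gradient tracking, heavy-ball momentum, and neighbor mixing can be sketched on a toy quadratic consensus problem. This is a hedged illustration, not the authors' DSGTm-TV: it uses a static undirected ring instead of time-varying directed networks, and the objectives, step size, and momentum value are illustrative assumptions.

```python
# Each agent i holds a private objective f_i(x) = 0.5 * (x - a_i)^2;
# the global objective is their sum, minimized at the mean of the a_i.
n = 4
targets = [1.0, 3.0, -2.0, 6.0]
opt = sum(targets) / n

def grad(i, x):
    return x - targets[i]

def mix(v):
    # doubly-stochastic mixing on a ring: weight 1/2 on self, 1/4 per neighbor
    return [0.5 * v[i] + 0.25 * v[(i - 1) % n] + 0.25 * v[(i + 1) % n]
            for i in range(n)]

alpha, beta = 0.05, 0.2          # step size and heavy-ball momentum (assumed)
x = [0.0] * n                    # local model parameters
x_prev = list(x)
y = [grad(i, x[i]) for i in range(n)]   # gradient trackers (estimate avg gradient)

for _ in range(2000):
    xm, ym = mix(x), mix(y)
    # descend along the tracked gradient, plus a momentum term
    x_new = [xm[i] - alpha * ym[i] + beta * (x[i] - x_prev[i]) for i in range(n)]
    # tracker update: mix, then add the change in the local gradient,
    # which keeps sum(y) equal to the sum of current local gradients
    y = [ym[i] + grad(i, x_new[i]) - grad(i, x[i]) for i in range(n)]
    x_prev, x = x, x_new
# every agent's x[i] should now be close to the global optimum `opt`
```

The tracker `y` is what distinguishes gradient tracking from plain decentralized gradient descent: each agent follows an estimate of the network-wide average gradient rather than only its own, which lets all agents agree on the exact global minimizer.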
arXiv Detail & Related papers (2024-09-25T06:23:16Z)
- SMaRt: Improving GANs with Score Matching Regularity [94.81046452865583]
Generative adversarial networks (GANs) usually struggle in learning from highly diverse data, whose underlying manifold is complex.
We show that score matching serves as a promising solution to this issue thanks to its capability of persistently pushing the generated data points towards the real data manifold.
We propose to improve the optimization of GANs with score matching regularity (SMaRt).
arXiv Detail & Related papers (2023-11-30T03:05:14Z)
- ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP).
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z)
- Semi-Supervised Image Captioning by Adversarially Propagating Labeled Data [95.0476489266988]
We present a novel data-efficient semi-supervised framework to improve the generalization of image captioning models.
Our proposed method trains a captioner to learn from paired data and to progressively associate unpaired data.
We present extensive empirical results on both (1) image-based and (2) dense region-based captioning datasets, followed by a comprehensive analysis of the scarcely-paired setting.
arXiv Detail & Related papers (2023-01-26T15:25:43Z)
- Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
- Exploring Data Aggregation and Transformations to Generalize across Visual Domains [0.0]
This thesis contributes to research on Domain Generalization (DG), Domain Adaptation (DA) and their variations.
We propose new frameworks for Domain Generalization and Domain Adaptation which make use of feature aggregation strategies and visual transformations.
We show how our proposed solutions outperform competitive state-of-the-art approaches in established DG and DA benchmarks.
arXiv Detail & Related papers (2021-08-20T14:58:14Z)
- Sparse Signal Models for Data Augmentation in Deep Learning ATR [0.8999056386710496]
We propose a data augmentation approach to incorporate domain knowledge and improve the generalization power of a data-intensive learning algorithm.
We exploit the sparsity of the scattering centers in the spatial domain and the smoothly-varying structure of the scattering coefficients in the azimuthal domain to solve the ill-posed problem of over-parametrized model fitting.
arXiv Detail & Related papers (2020-12-16T21:46:33Z)
- Online Descriptor Enhancement via Self-Labelling Triplets for Visual Data Association [28.03285334702022]
We propose a self-supervised method for incrementally refining visual descriptors to improve performance in the task of object-level visual data association.
Our method optimizes deep descriptor generators online by continuously training a widely available image classification network pre-trained with domain-independent data.
We show that our approach surpasses other visual data-association methods on a tracking-by-detection task, and that it provides larger performance gains than other methods that adapt to observed information.
arXiv Detail & Related papers (2020-11-06T17:42:04Z)
- Lessons Learned from the Training of GANs on Artificial Datasets [0.0]
Generative Adversarial Networks (GANs) have made great progress in synthesizing realistic images in recent years.
GANs are prone to underfitting or overfitting, which makes their analysis difficult and constrained.
We train them on artificial datasets where there are infinitely many samples and the real data distributions are simple.
We find that training mixtures of GANs yields larger performance gains than increasing the network depth or width.
arXiv Detail & Related papers (2020-07-13T14:51:02Z) - Gradient-Induced Co-Saliency Detection [81.54194063218216]
Co-saliency detection (Co-SOD) aims to segment the common salient foreground in a group of relevant images.
In this paper, inspired by human behavior, we propose a gradient-induced co-saliency detection method.
arXiv Detail & Related papers (2020-04-28T08:40:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.