Toward Zero-Shot Unsupervised Image-to-Image Translation
- URL: http://arxiv.org/abs/2007.14050v1
- Date: Tue, 28 Jul 2020 08:13:18 GMT
- Title: Toward Zero-Shot Unsupervised Image-to-Image Translation
- Authors: Yuanqi Chen, Xiaoming Yu, Shan Liu, Ge Li
- Abstract summary: We propose a zero-shot unsupervised image-to-image translation framework.
We introduce two strategies for exploiting the space spanned by the semantic attributes.
Our framework can be applied to many tasks, such as zero-shot classification and fashion design.
- Score: 34.51633300727676
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies have shown remarkable success in unsupervised image-to-image
translation. However, without access to enough images in the target classes,
learning a mapping from source classes to target classes tends to suffer from
mode collapse, which limits the applicability of existing methods. In this
work, we propose a zero-shot unsupervised image-to-image translation framework
to address this limitation by associating categories with side information
such as attributes. To generalize the translator to previously unseen classes,
we introduce two strategies for exploiting the space spanned by the semantic
attributes. Specifically, we propose to preserve semantic relations to the
visual space and to expand the attribute space by utilizing attribute vectors
of unseen classes, thus encouraging the translator to explore the modes of
unseen classes. Quantitative and qualitative results on different datasets
demonstrate the effectiveness of our proposed approach. Moreover, we
demonstrate that our framework can be applied to many tasks, such as zero-shot
classification and fashion design.
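The semantic-relation-preserving strategy described above can be illustrated with a minimal sketch: align the pairwise similarity structure of class attribute vectors with that of the corresponding visual features, so that classes close in attribute space stay close in visual space. This is an assumed pairwise-cosine formulation for illustration only; the paper's actual loss and feature definitions may differ.

```python
import numpy as np

def cosine_sim_matrix(X):
    # Pairwise cosine similarities between the rows of X.
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    return Xn @ Xn.T

def semantic_relation_loss(attrs, visual_feats):
    """Penalize mismatch between the pairwise similarity structure of
    class attribute vectors and that of their visual features.
    attrs: (num_classes, attr_dim), visual_feats: (num_classes, feat_dim).
    """
    S_attr = cosine_sim_matrix(attrs)
    S_vis = cosine_sim_matrix(visual_feats)
    return float(np.mean((S_attr - S_vis) ** 2))

# Toy example: 4 classes, 6-dim attributes, 8-dim visual features.
rng = np.random.default_rng(0)
attrs = rng.normal(size=(4, 6))
feats = rng.normal(size=(4, 8))
print(semantic_relation_loss(attrs, feats))  # non-negative scalar
```

Minimizing such a term during training would encourage the translator's visual representations to mirror the relational geometry of the attribute space, which is what lets the model extrapolate to attribute vectors of unseen classes.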
Related papers
- StegoGAN: Leveraging Steganography for Non-Bijective Image-to-Image Translation [18.213286385769525]
CycleGAN-based methods are known to hide the mismatched information in the generated images to bypass cycle consistency objectives.
We introduce StegoGAN, a novel model that leverages steganography to prevent spurious features in generated images.
Our approach enhances the semantic consistency of the translated images without requiring additional postprocessing or supervision.
arXiv Detail & Related papers (2024-03-29T12:23:58Z) - DualCoOp++: Fast and Effective Adaptation to Multi-Label Recognition
with Limited Annotations [79.433122872973]
Multi-label image recognition in the low-label regime is a task of great challenge and practical significance.
We leverage the powerful alignment between textual and visual features pretrained with millions of auxiliary image-text pairs.
We introduce an efficient and effective framework called Evidence-guided Dual Context Optimization (DualCoOp++)
arXiv Detail & Related papers (2023-08-03T17:33:20Z) - I2DFormer: Learning Image to Document Attention for Zero-Shot Image
Classification [123.90912800376039]
Online textual documents, e.g., Wikipedia, contain rich visual descriptions about object classes.
We propose I2DFormer, a novel transformer-based ZSL framework that jointly learns to encode images and documents.
Our method leads to highly interpretable results where document words can be grounded in the image regions.
arXiv Detail & Related papers (2022-09-21T12:18:31Z) - A Style-aware Discriminator for Controllable Image Translation [10.338078700632423]
Current image-to-image translation methods do not control the output domain beyond the classes used during training.
We propose a style-aware discriminator that acts both as a critic and as a style encoder to provide conditions.
Experiments on multiple datasets verify that the proposed model outperforms current state-of-the-art image-to-image translation methods.
arXiv Detail & Related papers (2022-03-29T09:13:33Z) - VGSE: Visually-Grounded Semantic Embeddings for Zero-Shot Learning [113.50220968583353]
We propose to discover semantic embeddings containing discriminative visual properties for zero-shot learning.
Our model visually divides a set of images from seen classes into clusters of local image regions according to their visual similarity.
We demonstrate that our visually-grounded semantic embeddings further improve performance over word embeddings across various ZSL models by a large margin.
arXiv Detail & Related papers (2022-03-20T03:49:02Z) - Region-level Active Learning for Cluttered Scenes [60.93811392293329]
We introduce a new strategy that subsumes previous Image-level and Object-level approaches into a generalized, Region-level approach.
We show that this approach significantly decreases labeling effort and improves rare object search on realistic data with inherent class-imbalance and cluttered scenes.
arXiv Detail & Related papers (2021-08-20T14:02:38Z) - Improving Few-shot Learning with Weakly-supervised Object Localization [24.3569501375842]
We propose a novel framework that generates class representations by extracting features from class-relevant regions of the images.
Our method outperforms the baseline few-shot model on the miniImageNet and tieredImageNet benchmarks.
arXiv Detail & Related papers (2021-05-25T07:39:32Z) - Contrastive Learning for Unsupervised Image-to-Image Translation [10.091669091440396]
We propose an unsupervised image-to-image translation method based on contrastive learning.
We randomly sample a pair of images and train the generator to change the appearance of one towards another while keeping the original structure.
Experimental results show that our method outperforms the leading unsupervised baselines in terms of visual quality and translation accuracy.
arXiv Detail & Related papers (2021-05-07T08:43:38Z) - Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation [128.03739769844736]
Two neural co-attentions are incorporated into the classifier to capture cross-image semantic similarities and differences.
In addition to boosting object pattern learning, the co-attention can leverage context from other related images to improve localization map inference.
Our algorithm sets a new state of the art in all these settings, demonstrating its efficacy and generalizability.
arXiv Detail & Related papers (2020-07-03T21:53:46Z) - Hierarchical Image Classification using Entailment Cone Embeddings [68.82490011036263]
We first inject label-hierarchy knowledge into an arbitrary CNN-based classifier.
We empirically show that availability of such external semantic information in conjunction with the visual semantics from images boosts overall performance.
arXiv Detail & Related papers (2020-04-02T10:22:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.