Few-shot Semantic Image Synthesis with Class Affinity Transfer
- URL: http://arxiv.org/abs/2304.02321v1
- Date: Wed, 5 Apr 2023 09:24:45 GMT
- Title: Few-shot Semantic Image Synthesis with Class Affinity Transfer
- Authors: Marlène Careil, Jakob Verbeek, Stéphane Lathuilière
- Abstract summary: We propose a transfer method that leverages a model trained on a large source dataset to improve the learning ability on small target datasets.
The class affinity matrix is introduced as a first layer to the source model to make it compatible with the target label maps.
We apply our approach to GAN-based and diffusion-based architectures for semantic synthesis.
- Score: 23.471210664024067
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Semantic image synthesis aims to generate photo realistic images given a
semantic segmentation map. Despite much recent progress, training such models still
requires large datasets of images annotated with per-pixel label maps that are
extremely tedious to obtain. To alleviate the high annotation cost, we propose
a transfer method that leverages a model trained on a large source dataset to
improve the learning ability on small target datasets via estimated pairwise
relations between source and target classes. The class affinity matrix is
introduced as a first layer to the source model to make it compatible with the
target label maps, and the source model is then further finetuned for the
target domain. To estimate the class affinities we consider different
approaches to leverage prior knowledge: semantic segmentation on the source
domain, textual label embeddings, and self-supervised vision features. We apply
our approach to GAN-based and diffusion-based architectures for semantic
synthesis. Our experiments show that the different ways to estimate class
affinity can be effectively combined, and that our approach significantly
improves over existing state-of-the-art transfer approaches for generative
image models.
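The core transfer mechanism can be sketched in a few lines: estimate a row-stochastic affinity matrix between target and source classes (here from label embeddings, one of the estimators the abstract mentions), then apply it as a first layer that converts one-hot target label maps into soft source-class maps the pretrained model can consume. This is a minimal illustrative sketch, not the authors' implementation; function names and shapes are assumptions.

```python
import numpy as np

def estimate_affinity_from_embeddings(target_emb, source_emb):
    """Row-stochastic affinity matrix from class label embeddings.

    Cosine similarity between each target and source class embedding,
    softmax-normalized so every target class maps to a distribution
    over source classes. (Textual label embeddings are one of several
    prior-knowledge estimators listed in the abstract.)
    """
    t = target_emb / np.linalg.norm(target_emb, axis=1, keepdims=True)
    s = source_emb / np.linalg.norm(source_emb, axis=1, keepdims=True)
    sim = t @ s.T                                   # (C_target, C_source)
    e = np.exp(sim - sim.max(axis=1, keepdims=True))  # stable softmax
    return e / e.sum(axis=1, keepdims=True)

def transfer_label_map(label_map, affinity, n_target):
    """Apply the affinity matrix as a first layer over the label map.

    One-hot encode the target label map, then multiply by the affinity
    matrix, yielding soft source-class maps; the source model is then
    finetuned on the target domain starting from this layer.
    """
    one_hot = np.eye(n_target)[label_map]           # (H, W, C_target)
    return one_hot @ affinity                       # (H, W, C_source)
```

Because each row of the affinity matrix sums to one, every pixel of the transferred map remains a valid distribution over source classes, so the pretrained model sees inputs of the same form it was trained on.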
Related papers
- Enhanced Generative Data Augmentation for Semantic Segmentation via Stronger Guidance [1.2923961938782627]
We introduce an effective data augmentation method for semantic segmentation using the Controllable Diffusion Model.
Our proposed method includes efficient prompt generation using Class-Prompt Appending and Visual Prior Combination.
We evaluate our method on the PASCAL VOC datasets and find it highly effective for synthesizing images for semantic segmentation.
arXiv Detail & Related papers (2024-09-09T19:01:14Z) - Is Synthetic Image Useful for Transfer Learning? An Investigation into Data Generation, Volume, and Utilization [62.157627519792946]
We introduce a novel framework called bridged transfer, which initially employs synthetic images for fine-tuning a pre-trained model to improve its transferability.
We propose dataset style inversion strategy to improve the stylistic alignment between synthetic and real images.
Our proposed methods are evaluated across 10 different datasets and 5 distinct models, demonstrating consistent improvements.
arXiv Detail & Related papers (2024-03-28T22:25:05Z) - Unlocking Pre-trained Image Backbones for Semantic Image Synthesis [29.688029979801577]
We propose a new class of GAN discriminators for semantic image synthesis that enables the generation of highly realistic images.
Our model, which we dub DP-SIMS, achieves state-of-the-art results in terms of image quality and consistency with the input label maps on ADE-20K, COCO-Stuff, and Cityscapes.
arXiv Detail & Related papers (2023-12-20T09:39:19Z) - Diversified in-domain synthesis with efficient fine-tuning for few-shot
classification [64.86872227580866]
Few-shot image classification aims to learn an image classifier using only a small set of labeled examples per class.
We propose DISEF, a novel approach which addresses the generalization challenge in few-shot learning using synthetic data.
We validate our method on ten different benchmarks, consistently outperforming baselines and establishing a new state-of-the-art for few-shot classification.
arXiv Detail & Related papers (2023-12-05T17:18:09Z) - Adapt Anything: Tailor Any Image Classifiers across Domains And
Categories Using Text-to-Image Diffusion Models [82.95591765009105]
We aim to study if a modern text-to-image diffusion model can tailor any task-adaptive image classifier across domains and categories.
We utilize only one off-the-shelf text-to-image model to synthesize images with category labels derived from the corresponding text prompts.
arXiv Detail & Related papers (2023-10-25T11:58:14Z) - Controllable Multi-domain Semantic Artwork Synthesis [17.536225601718687]
We propose a dataset that contains 40,000 images of artwork from 4 different domains with their corresponding semantic label maps.
We generate the dataset by first extracting semantic maps from landscape photography.
We then propose a conditional Generative Adversarial Network (GAN)-based approach to generate high-quality artwork.
arXiv Detail & Related papers (2023-08-19T21:16:28Z) - Edge Guided GANs with Multi-Scale Contrastive Learning for Semantic
Image Synthesis [139.2216271759332]
We propose a novel ECGAN for the challenging semantic image synthesis task.
The semantic labels do not provide detailed structural information, making it challenging to synthesize local details and structures.
The widely adopted CNN operations such as convolution, down-sampling, and normalization usually cause spatial resolution loss.
We propose a novel contrastive learning method, which aims to enforce pixel embeddings belonging to the same semantic class to generate more similar image content.
arXiv Detail & Related papers (2023-07-22T14:17:19Z) - Self-Supervised Generative Style Transfer for One-Shot Medical Image
Segmentation [10.634870214944055]
In medical image segmentation, supervised deep networks' success comes at the cost of requiring abundant labeled data.
We propose a novel volumetric self-supervised learning for data augmentation capable of synthesizing volumetric image-segmentation pairs.
Our approach combines one-shot generative learning with the proposed self-supervised training strategy.
arXiv Detail & Related papers (2021-10-05T15:28:42Z) - Semantic Segmentation with Generative Models: Semi-Supervised Learning
and Strong Out-of-Domain Generalization [112.68171734288237]
We propose a novel framework for discriminative pixel-level tasks using a generative model of both images and labels.
We learn a generative adversarial network that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images.
We demonstrate strong in-domain performance compared to several baselines, and are the first to showcase extreme out-of-domain generalization.
arXiv Detail & Related papers (2021-04-12T21:41:25Z) - Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation.
We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths.
In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
arXiv Detail & Related papers (2020-12-09T12:40:13Z) - Improving Augmentation and Evaluation Schemes for Semantic Image
Synthesis [16.097324852253912]
We introduce a novel augmentation scheme designed specifically for generative adversarial networks (GANs).
We propose to randomly warp object shapes in the semantic label maps used as an input to the generator.
The local shape discrepancies between the warped and non-warped label maps and images enable the GAN to learn better the structural and geometric details of the scene.
arXiv Detail & Related papers (2020-11-25T10:55:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.