High-resolution semantically-consistent image-to-image translation
- URL: http://arxiv.org/abs/2209.06264v1
- Date: Tue, 13 Sep 2022 19:08:30 GMT
- Title: High-resolution semantically-consistent image-to-image translation
- Authors: Mikhail Sokolov (1), Christopher Henry (1), Joni Storie (1),
Christopher Storie (1), Victor Alhassan (2), Mathieu Turgeon-Pelchat (2) ((1)
University of Winnipeg, (2) Canada Centre for Mapping and Earth Observation,
Natural Resources Canada)
- Abstract summary: This paper proposes an unsupervised domain adaptation model that preserves semantic consistency and per-pixel quality for the images during the style-transferring phase.
The proposed model shows substantial performance gain compared to the SemI2I model and reaches similar results as the state-of-the-art CyCADA model.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Deep learning has become one of the most effective computer vision
tools available to remote sensing scientists in recent years. However, the lack
of training labels for remote sensing datasets means that scientists must solve
the domain adaptation problem to narrow the discrepancy between satellite image
datasets. As a result, image segmentation models trained afterwards can
generalize better and use an existing set of labels instead of requiring new
ones. This work proposes an unsupervised domain adaptation model that preserves
semantic consistency and per-pixel quality for the images during the
style-transferring phase. This paper's major contribution is an improved
architecture of the SemI2I model, which significantly boosts the proposed
model's performance and makes it competitive with the state-of-the-art CyCADA
model. A second contribution is testing the CyCADA model on multi-band remote
sensing datasets such as WorldView-2 and SPOT-6. Because the proposed model
preserves semantic consistency and per-pixel quality during style transfer, a
semantic segmentation model trained on the adapted images shows a substantial
performance gain over the SemI2I model and reaches results similar to those of
the state-of-the-art CyCADA model. Future development of the proposed method
could include ecological domain transfer, a priori evaluation of dataset
quality in terms of data distribution, or exploration of the inner architecture
of the domain adaptation model.
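To make the semantic-consistency idea concrete, here is a minimal PyTorch sketch of the kind of loss term CyCADA popularized: a frozen segmentation network pins the translated image's predictions to pseudo-labels computed on the original image. The `generator` and `segmenter` names are illustrative placeholders, not the paper's actual code.

```python
# A minimal sketch (not the paper's code) of a CyCADA-style semantic-consistency
# term. `generator` maps source-domain images to the target style; `segmenter`
# is a segmentation network kept frozen for this loss. Names are illustrative.
import torch
import torch.nn.functional as F

def semantic_consistency_loss(generator, segmenter, src_images):
    """Penalize label drift between an image and its style-transferred version."""
    with torch.no_grad():
        # Pseudo-labels from the frozen segmenter on the original images.
        pseudo_labels = segmenter(src_images).argmax(dim=1)   # (N, H, W)
    translated = generator(src_images)                        # source -> target style
    trans_logits = segmenter(translated)                      # (N, C, H, W)
    # Cross-entropy ties the translated image's predictions to the pseudo-labels,
    # discouraging the generator from changing semantic content.
    return F.cross_entropy(trans_logits, pseudo_labels)
```

In practice such a term is added to adversarial and cycle-consistency losses, and the relative weights are hyperparameters.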
Related papers
- Controlling Human Shape and Pose in Text-to-Image Diffusion Models via Domain Adaptation [1.3654846342364308]
We present a methodology for conditional control of human shape and pose in pretrained text-to-image diffusion models.
Fine-tuning these diffusion models to adhere to new conditions requires large datasets and high-quality annotations.
We propose a domain-adaptation technique that maintains image quality by isolating synthetically trained conditional information.
arXiv Detail & Related papers (2024-11-07T14:02:41Z)
- Semantic Segmentation for Real-World and Synthetic Vehicle's Forward-Facing Camera Images [0.8562182926816566]
This is a solution to the semantic segmentation problem for both real-world and synthetic images from a vehicle's forward-facing camera.
We concentrate on building a robust model which performs well across various domains of different outdoor situations.
This paper studies the effectiveness of employing real-world and synthetic data to handle the domain adaptation in semantic segmentation problem.
arXiv Detail & Related papers (2024-07-07T17:28:45Z)
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model (a fine-tuning sketch follows this entry).
arXiv Detail & Related papers (2024-06-19T08:07:14Z)
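A minimal sketch of that augmentation step, assuming the counterfactual images were saved to an ImageFolder-style directory; the paths and the ResNet backbone are hypothetical, not the paper's setup.

```python
# A minimal fine-tuning sketch over real plus counterfactual images.
# Dataset paths and the backbone are hypothetical placeholders.
import torch
import torch.nn.functional as F
from torch.utils.data import ConcatDataset, DataLoader
from torchvision import datasets, transforms, models

tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
real = datasets.ImageFolder("data/real", transform=tfm)              # hypothetical path
counterfactual = datasets.ImageFolder("data/counterfactual", transform=tfm)
loader = DataLoader(ConcatDataset([real, counterfactual]), batch_size=32, shuffle=True)

model = models.resnet50(weights="IMAGENET1K_V2")
model.fc = torch.nn.Linear(model.fc.in_features, len(real.classes))
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

model.train()
for images, labels in loader:          # one pass of fine-tuning
    opt.zero_grad()
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    opt.step()
```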
- Regularized Training with Generated Datasets for Name-Only Transfer of Vision-Language Models [36.59260354292177]
Recent advancements in text-to-image generation have inspired researchers to generate datasets tailored for perception models using generative models.
We aim to fine-tune vision-language models for a specific classification task without access to any real images.
Despite the high fidelity of generated images, we observed a significant performance degradation when fine-tuning the model using the generated datasets.
arXiv Detail & Related papers (2024-06-08T10:43:49Z)
- Is Synthetic Image Useful for Transfer Learning? An Investigation into Data Generation, Volume, and Utilization [62.157627519792946]
We introduce a novel framework called bridged transfer, which initially employs synthetic images for fine-tuning a pre-trained model to improve its transferability.
We propose a dataset style inversion strategy to improve the stylistic alignment between synthetic and real images.
Our proposed methods are evaluated across 10 different datasets and 5 distinct models, demonstrating consistent improvements.
arXiv Detail & Related papers (2024-03-28T22:25:05Z)
- UniDiff: Advancing Vision-Language Models with Generative and Discriminative Learning [86.91893533388628]
This paper presents UniDiff, a unified multi-modal model that integrates image-text contrastive learning (ITC), text-conditioned image synthesis learning (IS), and reciprocal semantic consistency modeling (RSC).
UniDiff demonstrates versatility in both multi-modal understanding and generative tasks (the ITC objective is sketched after this entry).
arXiv Detail & Related papers (2023-06-01T15:39:38Z)
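Of the three components, ITC is the most standardized; a minimal CLIP-style sketch of such an objective, with the encoder outputs assumed rather than taken from UniDiff, looks like this:

```python
# A minimal CLIP-style image-text contrastive (ITC) sketch; `img_emb` and
# `txt_emb` are assumed to be matched (N, D) embedding batches.
import torch
import torch.nn.functional as F

def itc_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE over matched image/text embedding pairs."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature              # (N, N) similarities
    targets = torch.arange(img_emb.size(0), device=img_emb.device)
    # The i-th image should match the i-th text, in both directions.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```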
- Diversity is Definitely Needed: Improving Model-Agnostic Zero-shot Classification via Stable Diffusion [22.237426507711362]
Model-Agnostic Zero-Shot Classification (MA-ZSC) refers to training non-specific classification architectures to classify real images without using any real images during training.
Recent research has demonstrated that generating synthetic training images using diffusion models provides a potential solution to address MA-ZSC.
We propose modifications to the text-to-image generation process using a pre-trained diffusion model to enhance diversity (a generation sketch follows this entry).
arXiv Detail & Related papers (2023-02-07T07:13:53Z)
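One common way to realize such diversity-oriented modifications, sketched here with the Hugging Face diffusers library, is to vary prompt templates and random seeds; the templates and class names below are illustrative, not the paper's exact scheme.

```python
# A hedged sketch: raising synthetic-data diversity by varying prompt templates
# and seeds with a pre-trained Stable Diffusion pipeline. Templates and class
# names are illustrative placeholders.
import os
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

os.makedirs("synthetic", exist_ok=True)
styles = ["a photo of", "an oil painting of", "a sketch of", "a low-light photo of"]
for cls in ["dog", "car"]:                                  # hypothetical class names
    for i, style in enumerate(styles):
        gen = torch.Generator("cuda").manual_seed(i)        # vary the seed as well
        image = pipe(f"{style} a {cls}", generator=gen).images[0]
        image.save(f"synthetic/{cls}_{i}.png")
```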
- InvGAN: Invertible GANs [88.58338626299837]
InvGAN, short for Invertible GAN, successfully embeds real images into the latent space of a high-quality generative model.
This allows us to perform image inpainting, merging, and online data augmentation (a generic inversion sketch follows this entry).
arXiv Detail & Related papers (2021-12-08T21:39:00Z)
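InvGAN itself learns the embedding with an encoder; for orientation, here is the slower optimization-based inversion baseline that such encoders replace, with a generic pre-trained `generator` assumed.

```python
# A generic optimization-based GAN-inversion sketch, not InvGAN's encoder-based
# method. `generator` is any pre-trained decoder G(z) with a known latent size.
import torch
import torch.nn.functional as F

def invert(generator, target_image, latent_dim=512, steps=500, lr=0.05):
    """Fit a latent z whose decoded image reconstructs `target_image`."""
    z = torch.randn(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(generator(z), target_image)   # pixel reconstruction only
        loss.backward()
        opt.step()
    return z.detach()
```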
- Transferring and Regularizing Prediction for Semantic Segmentation [115.88957139226966]
In this paper, we exploit the intrinsic properties of semantic segmentation to alleviate this problem for model transfer.
We present a Regularizer of Prediction Transfer (RPT) that imposes the intrinsic properties as constraints to regularize model transfer in an unsupervised fashion.
Extensive experiments are conducted to verify the proposal of RPT on the transfer of models trained on GTA5 and SYNTHIA (synthetic data) to the Cityscapes dataset (urban street scenes); a stand-in regularizer is sketched after this entry.
arXiv Detail & Related papers (2020-06-11T16:19:41Z)
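RPT's specific constraints come from properties of segmentation outputs; as a generic stand-in rather than RPT's actual formulation, a minimal unsupervised regularizer of this kind is entropy minimization on target-domain predictions.

```python
# A generic stand-in, not RPT's actual constraints: entropy minimization on
# unlabeled target-domain segmentation predictions.
import torch

def prediction_entropy(logits):
    """Mean per-pixel entropy of segmentation logits shaped (N, C, H, W)."""
    probs = logits.softmax(dim=1)
    ent = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)   # (N, H, W)
    return ent.mean()

# Typical use: total_loss = ce_on_source + lam * prediction_entropy(target_logits)
```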
- Multi-task pre-training of deep neural networks for digital pathology [8.74883469030132]
We first assemble and transform many digital pathology datasets into a pool of 22 classification tasks and almost 900k images.
We show that our models used as feature extractors either improve significantly over ImageNet pre-trained models or provide comparable performance.
arXiv Detail & Related papers (2020-05-05T08:50:17Z)
- Learning Deformable Image Registration from Optimization: Perspective, Modules, Bilevel Training and Beyond [62.730497582218284]
We develop a new deep-learning-based framework to optimize a diffeomorphic model via multi-scale propagation.
We conduct two groups of image registration experiments on 3D volume datasets, including image-to-atlas registration on brain MRI data and image-to-image registration on liver CT data (a minimal registration sketch follows below).
arXiv Detail & Related papers (2020-04-30T03:23:45Z)
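The paper's framework is diffeomorphic, multi-scale, and operates on 3D volumes; as a bare-bones illustration of the underlying warp-and-match step only, here is a single-scale 2D displacement-field registration sketch in PyTorch.

```python
# A bare-bones 2D, single-scale warp-and-match sketch; the paper's framework is
# diffeomorphic, multi-scale, and 3D, so this is illustrative only.
import torch
import torch.nn.functional as F

def register(moving, fixed, steps=200, lr=0.1, smooth=0.1):
    """moving/fixed: (1, 1, H, W) images; returns a fitted displacement field."""
    _, _, h, w = moving.shape
    # Identity sampling grid in [-1, 1] coordinates, shape (1, H, W, 2).
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w),
                            indexing="ij")
    identity = torch.stack([xs, ys], dim=-1).unsqueeze(0)
    disp = torch.zeros_like(identity, requires_grad=True)
    opt = torch.optim.Adam([disp], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        warped = F.grid_sample(moving, identity + disp, align_corners=True)
        sim = F.mse_loss(warped, fixed)                       # image similarity
        # Finite-difference penalty keeps the displacement field smooth.
        reg = disp.diff(dim=1).pow(2).mean() + disp.diff(dim=2).pow(2).mean()
        (sim + smooth * reg).backward()
        opt.step()
    return disp.detach()
```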
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.