Semantic-guided Multi-Mask Image Harmonization
- URL: http://arxiv.org/abs/2207.11722v1
- Date: Sun, 24 Jul 2022 11:48:49 GMT
- Title: Semantic-guided Multi-Mask Image Harmonization
- Authors: Xuqian Ren, Yifan Liu
- Abstract summary: We propose a new semantic-guided multi-mask image harmonization task.
In this work, we propose a novel way to edit inharmonious images by predicting a series of operator masks.
- Score: 10.27974860479791
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous harmonization methods focus on adjusting one inharmonious region in
an image based on an input mask. They may face problems when dealing with
different perturbations on different semantic regions without available input
masks. To deal with the problem that one image has been pasted with several
foregrounds coming from different images and needs to harmonize them towards
different domain directions without any mask as input, we propose a new
semantic-guided multi-mask image harmonization task. Different from the
previous single-mask image harmonization task, each inharmonious image is
perturbed with different methods according to the semantic segmentation masks.
Two challenging benchmarks, HScene and HLIP, are constructed based on $150$ and
$19$ semantic classes, respectively. Furthermore, previous baselines focus on
regressing the exact value for each pixel of the harmonized images. The
generated results are in the `black box' and cannot be edited. In this work, we
propose a novel way to edit the inharmonious images by predicting a series of
operator masks. The masks indicate the level and the position to apply a
certain image editing operation, which could be the brightness, the saturation,
and the color in a specific dimension. The operator masks provide more
flexibility for users to edit the image further. Extensive experiments verify
that the operator mask-based network can further improve those state-of-the-art
methods which directly regress RGB images when the perturbations are
structural. Experiments have been conducted on our constructed benchmarks to
verify that our proposed operator mask-based framework can locate and modify
the inharmonious regions in more complex scenes. Our code and models are
available at
https://github.com/XuqianRen/Semantic-guided-Multi-mask-Image-Harmonization.git.
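To make the operator-mask idea concrete, the following is a minimal sketch of applying a predicted brightness operator mask, where each mask value encodes how strongly the operation is applied at that pixel. The blending convention, value ranges, and `max_shift` parameter are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def apply_brightness_operator(image, operator_mask, max_shift=0.5):
    """Apply a per-pixel brightness edit controlled by an operator mask.

    image:         float array in [0, 1], shape (H, W, 3).
    operator_mask: float array in [-1, 1], shape (H, W); 0 means no edit,
                   positive values brighten, negative values darken
                   (illustrative convention).
    max_shift:     maximum absolute brightness change (assumed hyperparameter).
    """
    shift = operator_mask[..., None] * max_shift   # broadcast over RGB channels
    return np.clip(image + shift, 0.0, 1.0)

# Example: brighten only one predicted region of the composite image.
image = np.random.rand(64, 64, 3).astype(np.float32)
mask = np.zeros((64, 64), dtype=np.float32)
mask[16:48, 16:48] = 0.8                           # strong edit inside the region
harmonized = apply_brightness_operator(image, mask)
```

Because the edit is expressed as an operation plus a mask rather than directly regressed RGB values, a user can rescale or erase parts of the mask after inference to refine the result.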
Related papers
- Semantic Image Synthesis with Unconditional Generator [8.65146533481257]
We propose to employ a pre-trained unconditional generator and rearrange its feature maps according to proxy masks.
The proxy masks are prepared from the feature maps of random samples in the generator by simple clustering.
Our method is versatile across various applications such as free-form spatial editing of real images, sketch-to-photo, and even scribble-to-photo.
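The clustering step can be pictured with a short sketch: per-pixel feature vectors from a generator layer are grouped with k-means, and the cluster label map serves as a proxy mask. The feature shapes and cluster count are illustrative assumptions, and a random array stands in for real generator features.

```python
import numpy as np
from sklearn.cluster import KMeans

def proxy_mask_from_features(feature_map, n_clusters=6):
    """Cluster per-pixel feature vectors into a proxy segmentation mask.

    feature_map: float array of shape (C, H, W), e.g. an intermediate
                 feature map of a pretrained generator.
    Returns an (H, W) integer label map serving as the proxy mask.
    """
    c, h, w = feature_map.shape
    pixels = feature_map.reshape(c, h * w).T       # (H*W, C) feature vectors
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(pixels)
    return labels.reshape(h, w)

features = np.random.rand(64, 16, 16).astype(np.float32)  # stand-in features
proxy_mask = proxy_mask_from_features(features)
```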
arXiv Detail & Related papers (2024-02-22T09:10:28Z)
- Variance-insensitive and Target-preserving Mask Refinement for Interactive Image Segmentation [68.16510297109872]
Point-based interactive image segmentation can ease the burden of mask annotation in applications such as semantic segmentation and image editing.
We introduce a novel method, Variance-Insensitive and Target-Preserving Mask Refinement, to enhance segmentation quality with fewer user inputs.
Experiments on GrabCut, Berkeley, SBD, and DAVIS datasets demonstrate our method's state-of-the-art performance in interactive image segmentation.
arXiv Detail & Related papers (2023-12-22T02:31:31Z)
- Segment (Almost) Nothing: Prompt-Agnostic Adversarial Attacks on Segmentation Models [61.46999584579775]
General-purpose segmentation models are able to generate (semantic) segmentation masks from a variety of prompts.
In particular, input images are pre-processed by an image encoder to obtain embedding vectors which are later used for mask predictions.
We show that even imperceptible perturbations of radius $\epsilon=1/255$ are often sufficient to drastically modify the masks predicted with point, box, and text prompts.
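A generic way to realize such an attack is a PGD-style loop that maximizes the embedding distance between the clean and perturbed image while projecting the perturbation back into the $\epsilon$-ball. This is a sketch under assumptions, not the paper's exact procedure; the toy encoder stands in for a real promptable-segmentation image encoder.

```python
import torch
import torch.nn as nn

# Toy stand-in for the image encoder of a promptable segmentation model.
encoder = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten())

def embedding_attack(image, eps=1 / 255, steps=10, step_size=1e-3):
    """Find an L-inf perturbation of radius eps that moves the embedding.

    Maximizes ||f(x + delta) - f(x)||^2 subject to ||delta||_inf <= eps,
    which indirectly corrupts any mask predicted from the embedding.
    """
    with torch.no_grad():
        target = encoder(image)                    # clean embedding
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        loss = -(encoder(image + delta) - target).pow(2).sum()
        loss.backward()
        with torch.no_grad():
            delta -= step_size * delta.grad.sign() # ascend the embedding distance
            delta.clamp_(-eps, eps)                # project back into the eps-ball
        delta.grad.zero_()
    return (image + delta).clamp(0, 1).detach()

adversarial = embedding_attack(torch.rand(1, 3, 32, 32))
```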
arXiv Detail & Related papers (2023-11-24T12:57:34Z)
- Improving Masked Autoencoders by Learning Where to Mask [65.89510231743692]
Masked image modeling is a promising self-supervised learning method for visual data.
We present AutoMAE, a framework that uses Gumbel-Softmax to interlink an adversarially-trained mask generator and a mask-guided image modeling process.
In our experiments, AutoMAE is shown to provide effective pretraining models on standard self-supervised benchmarks and downstream tasks.
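The interlinking mechanism can be sketched as follows: the mask generator produces per-patch logits, and hard Gumbel-Softmax sampling yields discrete keep/mask decisions in the forward pass while the straight-through estimator lets gradients reach the generator. The patch count and two-way parameterization below are illustrative assumptions, not AutoMAE's exact architecture.

```python
import torch
import torch.nn.functional as F

def sample_patch_mask(logits, tau=1.0):
    """Differentiably sample a binary keep/mask decision per patch.

    logits: (num_patches, 2) scores for the two choices (keep, mask).
    hard=True returns one-hot samples in the forward pass while gradients
    flow through the soft relaxation (straight-through estimator).
    """
    one_hot = F.gumbel_softmax(logits, tau=tau, hard=True)  # (P, 2)
    return one_hot[:, 1]                                    # 1.0 where masked

logits = torch.randn(196, 2, requires_grad=True)            # e.g. 14x14 patches
mask = sample_patch_mask(logits)
mask.sum().backward()                                       # gradients reach the logits
```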
arXiv Detail & Related papers (2023-03-12T05:28:55Z)
- Towards Improved Input Masking for Convolutional Neural Networks [66.99060157800403]
We propose a new masking method for CNNs, which we call layer masking.
We show that our method is able to eliminate or minimize the influence of the mask shape or color on the output of the model.
We also demonstrate how the shape of the mask may leak information about the class, thus affecting estimates of model reliance on class-relevant features.
arXiv Detail & Related papers (2022-11-26T19:31:49Z)
- RePaint: Inpainting using Denoising Diffusion Probabilistic Models [161.74792336127345]
Free-form inpainting is the task of adding new content to an image in the regions specified by an arbitrary binary mask.
We propose RePaint: a Denoising Diffusion Probabilistic Model (DDPM) based inpainting approach that is applicable to even extreme masks.
We validate our method for both faces and general-purpose image inpainting using standard and extreme masks.
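The core mask-conditioning step of this kind of DDPM inpainting can be sketched as follows, in a simplified form that omits the paper's resampling schedule: at each reverse step, known pixels are re-noised from the original image with the forward process, unknown pixels come from the learned reverse process, and the binary mask combines the two.

```python
import torch

def repaint_combine(x_known0, x_unknown_prev, mask, alpha_bar_prev):
    """One simplified RePaint-style combination step.

    x_known0:       original image (known pixels), shape (B, C, H, W).
    x_unknown_prev: the model's reverse-diffusion sample x_{t-1}.
    mask:           1 where pixels are known, 0 where they must be inpainted.
    alpha_bar_prev: cumulative noise-schedule value at step t-1.
    """
    noise = torch.randn_like(x_known0)
    x_known_prev = (alpha_bar_prev ** 0.5) * x_known0 \
                 + ((1 - alpha_bar_prev) ** 0.5) * noise  # forward-noise the known part
    return mask * x_known_prev + (1 - mask) * x_unknown_prev

x0 = torch.rand(1, 3, 8, 8)                   # original image
x_prev = torch.randn(1, 3, 8, 8)              # sample from the reverse process
m = torch.zeros(1, 1, 8, 8)
m[..., :4, :] = 1.0                           # top half known, bottom half inpainted
x_combined = repaint_combine(x0, x_prev, m, alpha_bar_prev=0.9)
```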
arXiv Detail & Related papers (2022-01-24T18:40:15Z)
- GANSeg: Learning to Segment by Unsupervised Hierarchical Image Generation [16.900404701997502]
We propose a GAN-based approach that generates images conditioned on latent masks.
We show that such mask-conditioned image generation can be learned faithfully when conditioning the masks in a hierarchical manner.
It also lets us generate image-mask pairs for training a segmentation network, which outperforms the state-of-the-art unsupervised segmentation methods on established benchmarks.
arXiv Detail & Related papers (2021-12-02T07:57:56Z)
- Image Inpainting by End-to-End Cascaded Refinement with Mask Awareness [66.55719330810547]
Inpainting arbitrary missing regions is challenging because learning valid features for various masked regions is nontrivial.
We propose a novel mask-aware inpainting solution that learns multi-scale features for missing regions in the encoding phase.
Our framework is validated both quantitatively and qualitatively via extensive experiments on three public datasets.
arXiv Detail & Related papers (2021-04-28T13:17:47Z)
- Few-shot Semantic Image Synthesis Using StyleGAN Prior [8.528384027684192]
We present a training strategy that performs pseudo labeling of semantic masks using the StyleGAN prior.
Our key idea is to construct a simple mapping between the StyleGAN feature and each semantic class from a few examples of semantic masks.
Although the pseudo semantic masks might be too coarse for previous approaches that require pixel-aligned masks, our framework can synthesize high-quality images from not only dense semantic masks but also sparse inputs such as landmarks and scribbles.
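One simple instantiation of such a feature-to-class mapping, assumed here for illustration rather than taken from the paper, is a nearest-centroid classifier over per-pixel generator features built from a few annotated masks.

```python
import numpy as np

def pseudo_label(features, labeled_features, labeled_classes):
    """Assign each pixel the class of its nearest labeled feature centroid.

    features:         (H, W, C) per-pixel generator features to label.
    labeled_features: (N, C) features gathered from a few annotated masks.
    labeled_classes:  (N,) class index of each labeled feature.
    """
    classes = np.unique(labeled_classes)
    centroids = np.stack([labeled_features[labeled_classes == c].mean(0)
                          for c in classes])
    dists = np.linalg.norm(features[..., None, :] - centroids, axis=-1)  # (H, W, K)
    return classes[dists.argmin(-1)]                                     # (H, W) pseudo mask

feats = np.random.rand(16, 16, 8)             # stand-in generator features
lab_feats = np.random.rand(40, 8)             # features from a few labeled pixels
lab_cls = np.random.randint(0, 4, size=40)
pseudo_mask = pseudo_label(feats, lab_feats, lab_cls)
```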
arXiv Detail & Related papers (2021-03-27T11:04:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.