When ControlNet Meets Inexplicit Masks: A Case Study of ControlNet on
its Contour-following Ability
- URL: http://arxiv.org/abs/2403.00467v1
- Date: Fri, 1 Mar 2024 11:45:29 GMT
- Title: When ControlNet Meets Inexplicit Masks: A Case Study of ControlNet on
its Contour-following Ability
- Authors: Wenjie Xuan, Yufei Xu, Shanshan Zhao, Chaoyue Wang, Juhua Liu, Bo Du,
Dacheng Tao
- Abstract summary: ControlNet excels at creating content that closely matches precise contours in user-provided masks.
When these masks contain noise, a frequent occurrence with non-expert users, the output may include unwanted artifacts.
This paper first highlights, through in-depth analysis, the crucial role of controlling the impact of these inexplicit masks with diverse deterioration levels.
To enhance controllability with inexplicit masks, an advanced Shape-aware ControlNet consisting of a deterioration estimator and a shape-prior modulation block is devised.
- Score: 97.82197656469972
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: ControlNet excels at creating content that closely matches precise contours
in user-provided masks. However, when these masks contain noise, a frequent
occurrence with non-expert users, the output may include unwanted artifacts.
This paper first highlights the crucial role of controlling the impact of these
inexplicit masks with diverse deterioration levels through in-depth analysis.
Subsequently, to enhance controllability with inexplicit masks, an advanced
Shape-aware ControlNet consisting of a deterioration estimator and a
shape-prior modulation block is devised. The deterioration estimator assesses
the deterioration factor of the provided masks. This factor is then used in
the modulation block to adaptively modulate the model's contour-following
ability, helping it disregard the noisy parts of inexplicit masks.
Extensive experiments prove its effectiveness in encouraging ControlNet to
interpret inaccurate spatial conditions robustly rather than blindly following
the given contours. We showcase application scenarios like modifying shape
priors and composable shape-controllable generation. Code will be released soon.
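The two components described above can be illustrated with a minimal sketch. All names and formulas here are assumptions for exposition only (the paper's code is not yet released): the "estimator" is a toy pixel-disagreement proxy, and "modulation" is modeled as down-weighting the conditioning signal as the deterioration factor grows.

```python
# Illustrative sketch of the abstract's two components; names and math are
# hypothetical, not the authors' implementation.

def estimate_deterioration(mask, clean_mask):
    """Toy deterioration estimator: fraction of pixels on which the
    user-provided mask disagrees with a clean reference mask, in [0, 1]."""
    mismatched = sum(1 for a, b in zip(mask, clean_mask) if a != b)
    return mismatched / len(mask)

def modulate_control(control_signal, factor, alpha=1.0):
    """Toy shape-prior modulation: scale the contour-conditioning signal
    down as deterioration rises, so noisy contours are followed less
    strictly. alpha controls how aggressively the signal is weakened."""
    scale = max(0.0, 1.0 - alpha * factor)
    return [x * scale for x in control_signal]

# Flattened 1D "masks" for illustration; 1 = foreground, 0 = background.
clean = [1, 1, 1, 0, 0, 0, 0, 0]
noisy = [1, 1, 0, 1, 0, 0, 0, 0]  # two pixels deviate from the clean contour

factor = estimate_deterioration(noisy, clean)    # 2 / 8 = 0.25
weakened = modulate_control([4.0, 8.0], factor)  # scaled by 0.75 -> [3.0, 6.0]
print(factor, weakened)
```

In the actual model the modulation would act on ControlNet's feature maps rather than a raw list, and the estimator is learned rather than computed against a known clean mask; the sketch only conveys the control flow: estimate a deterioration factor, then use it to soften the contour-following strength.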
Related papers
- ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders [53.3185750528969]
Masked AutoEncoders (MAE) have emerged as a robust self-supervised framework.
We introduce a data-independent method, termed ColorMAE, which generates different binary mask patterns by filtering random noise.
We demonstrate our strategy's superiority in downstream tasks compared to random masking.
arXiv Detail & Related papers (2024-07-17T22:04:00Z)
- Enhancing Prompt Following with Visual Control Through Training-Free Mask-Guided Diffusion [27.61734719689046]
We propose a training-free approach named Mask-guided Prompt Following (MGPF) to enhance prompt following with visual control.
The efficacy and superiority of MGPF are validated through comprehensive quantitative and qualitative experiments.
arXiv Detail & Related papers (2024-04-23T06:10:43Z)
- Masking Improves Contrastive Self-Supervised Learning for ConvNets, and Saliency Tells You Where [63.61248884015162]
We aim to alleviate the burden of including masking operation into the contrastive-learning framework for convolutional neural networks.
We propose to explicitly take the saliency constraint into consideration in which the masked regions are more evenly distributed among the foreground and background.
arXiv Detail & Related papers (2023-09-22T09:58:38Z)
- Towards Improved Input Masking for Convolutional Neural Networks [66.99060157800403]
We propose a new masking method for CNNs we call layer masking.
We show that our method is able to eliminate or minimize the influence of the mask shape or color on the output of the model.
We also demonstrate how the shape of the mask may leak information about the class, thus affecting estimates of model reliance on class-relevant features.
arXiv Detail & Related papers (2022-11-26T19:31:49Z)
- Real-Time Mask Detection Based on SSD-MobileNetV2 [2.538209532048867]
An effective automatic real-time mask detection system can substantially reduce the workload of relevant staff.
Existing mask detection approaches are resource-intensive and do not achieve a good balance between speed and accuracy.
In this paper, we propose a new architecture for mask detection.
arXiv Detail & Related papers (2022-08-29T01:59:22Z)
- Calibrated Hyperspectral Image Reconstruction via Graph-based Self-Tuning Network [40.71031760929464]
Hyperspectral imaging (HSI) has attracted increasing research attention, especially for the ones based on a coded snapshot spectral imaging (CASSI) system.
Existing deep HSI reconstruction models are generally trained on paired data to retrieve original signals upon 2D compressed measurements given by a particular optical hardware mask in CASSI.
This mask-specific training style will lead to a hardware miscalibration issue, which sets up barriers to deploying deep HSI models among different hardware and noisy environments.
We propose a novel Graph-based Self-Tuning (GST) network to reason about uncertainties, adapting to the varying spatial structures of masks among different hardware.
arXiv Detail & Related papers (2021-12-31T09:39:13Z)
- Mask or Non-Mask? Robust Face Mask Detector via Triplet-Consistency Representation Learning [23.062034116854875]
In the absence of vaccines or medicines to stop COVID-19, one of the effective methods to slow the spread of the coronavirus is to wear a face mask.
To mandate the use of face masks or coverings in public areas, additional human resources are required, which is tedious and attention-intensive.
We propose a face mask detection framework that uses the context attention module to enable the effective attention of the feed-forward convolution neural network.
arXiv Detail & Related papers (2021-10-01T16:44:06Z)
- Image Inpainting by End-to-End Cascaded Refinement with Mask Awareness [66.55719330810547]
Inpainting arbitrary missing regions is challenging because learning valid features for various masked regions is nontrivial.
We propose a novel mask-aware inpainting solution that learns multi-scale features for missing regions in the encoding phase.
Our framework is validated both quantitatively and qualitatively via extensive experiments on three public datasets.
arXiv Detail & Related papers (2021-04-28T13:17:47Z)
- MagGAN: High-Resolution Face Attribute Editing with Mask-Guided Generative Adversarial Network [145.4591079418917]
MagGAN learns to only edit the facial parts that are relevant to the desired attribute changes.
A novel mask-guided conditioning strategy is introduced to incorporate the influence region of each attribute change into the generator.
A multi-level patch-wise discriminator structure is proposed to scale our model for high-resolution ($1024 \times 1024$) face editing.
arXiv Detail & Related papers (2020-10-03T20:56:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.