SmartMask: Context Aware High-Fidelity Mask Generation for Fine-grained
Object Insertion and Layout Control
- URL: http://arxiv.org/abs/2312.05039v1
- Date: Fri, 8 Dec 2023 13:38:22 GMT
- Title: SmartMask: Context Aware High-Fidelity Mask Generation for Fine-grained
Object Insertion and Layout Control
- Authors: Jaskirat Singh, Jianming Zhang, Qing Liu, Cameron Smith, Zhe Lin,
Liang Zheng
- Abstract summary: We introduce SmartMask, which allows novice users to create detailed masks for precise object insertion.
Our experiments demonstrate that SmartMask achieves superior object insertion quality.
Unlike prior works, the proposed approach can also be used even without user-mask guidance.
- Score: 37.25722058326221
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The field of generative image inpainting and object insertion has made
significant progress with the recent advent of latent diffusion models.
Utilizing a precise object mask can greatly enhance these applications.
However, due to the challenges users encounter in creating high-fidelity masks,
there is a tendency for these methods to rely on more coarse masks (e.g.,
bounding box) for these applications. This results in limited control and
compromised background content preservation. To overcome these limitations, we
introduce SmartMask, which allows any novice user to create detailed masks for
precise object insertion. Combined with a ControlNet-Inpaint model, our
experiments demonstrate that SmartMask achieves superior object insertion
quality, preserving the background content more effectively than previous
methods. Notably, unlike prior works the proposed approach can also be used
even without user-mask guidance, which allows it to perform mask-free object
insertion at diverse positions and scales. Furthermore, we find that when used
iteratively with a novel instruction-tuning based planning model, SmartMask can
be used to design detailed layouts from scratch. As compared with user-scribble
based layout design, we observe that SmartMask allows for better quality
outputs with layout-to-image generation methods. Project page is available at
https://smartmask-gen.github.io
Related papers
- ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders [53.3185750528969]
Masked AutoEncoders (MAE) have emerged as a robust self-supervised framework.
We introduce a data-independent method, termed ColorMAE, which generates different binary mask patterns by filtering random noise.
We demonstrate our strategy's superiority in downstream tasks compared to random masking.
arXiv Detail & Related papers (2024-07-17T22:04:00Z) - Automatic Generation of Semantic Parts for Face Image Synthesis [7.728916126705043]
We describe a network architecture to address the problem of automatically manipulating or generating the shape of object classes in semantic segmentation masks.
Our proposed model allows embedding the mask class-wise into a latent space where each class embedding can be independently edited.
We report quantitative and qualitative results on the Celeb-MaskHQ dataset, which show our model can both faithfully reconstruct and modify a segmentation mask at the class level.
arXiv Detail & Related papers (2023-07-11T15:01:42Z) - Improving Masked Autoencoders by Learning Where to Mask [65.89510231743692]
Masked image modeling is a promising self-supervised learning method for visual data.
We present AutoMAE, a framework that uses Gumbel-Softmax to interlink an adversarially-trained mask generator and a mask-guided image modeling process.
In our experiments, AutoMAE is shown to provide effective pretraining models on standard self-supervised benchmarks and downstream tasks.
arXiv Detail & Related papers (2023-03-12T05:28:55Z) - Towards Improved Input Masking for Convolutional Neural Networks [66.99060157800403]
We propose a new masking method for CNNs we call layer masking.
We show that our method is able to eliminate or minimize the influence of the mask shape or color on the output of the model.
We also demonstrate how the shape of the mask may leak information about the class, thus affecting estimates of model reliance on class-relevant features.
arXiv Detail & Related papers (2022-11-26T19:31:49Z) - BoxTeacher: Exploring High-Quality Pseudo Labels for Weakly Supervised
Instance Segmentation [33.64088504387974]
BoxTeacher is an efficient and end-to-end training framework for high-performance weakly supervised instance segmentation.
We present a mask-aware confidence score to estimate the quality of pseudo masks, and propose the noise-aware pixel loss and noise-reduced affinity loss to adaptively optimize the student with pseudo masks.
arXiv Detail & Related papers (2022-10-11T06:23:30Z) - Layered Depth Refinement with Mask Guidance [61.10654666344419]
We formulate a novel problem of mask-guided depth refinement that utilizes a generic mask to refine the depth prediction of SIDE models.
Our framework performs layered refinement and inpainting/outpainting, decomposing the depth map into two separate layers signified by the mask and the inverse mask.
We empirically show that our method is robust to different types of masks and initial depth predictions, accurately refining depth values in inner and outer mask boundary regions.
arXiv Detail & Related papers (2022-06-07T06:42:44Z) - MaskMTL: Attribute prediction in masked facial images with deep
multitask learning [9.91045425400833]
This paper presents a deep Multi-Task Learning (MTL) approach to jointly estimate various heterogeneous attributes from a single masked facial image.
The proposed approach supersedes in performance to other competing techniques.
arXiv Detail & Related papers (2022-01-09T13:03:29Z) - Ternary Feature Masks: zero-forgetting for task-incremental learning [68.34518408920661]
We propose an approach without any forgetting to continual learning for the task-aware regime.
By using ternary masks we can upgrade a model to new tasks, reusing knowledge from previous tasks while not forgetting anything about them.
Our method outperforms current state-of-the-art while reducing memory overhead in comparison to weight-based approaches.
arXiv Detail & Related papers (2020-01-23T18:08:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.