An Inpainting-Infused Pipeline for Attire and Background Replacement
- URL: http://arxiv.org/abs/2402.03501v1
- Date: Mon, 5 Feb 2024 20:34:32 GMT
- Title: An Inpainting-Infused Pipeline for Attire and Background Replacement
- Authors: Felipe Rodrigues Perche-Mahlow, André Felipe-Zanella, William Alberto Cruz-Castañeda, and Marcellus Amadeus
- Abstract summary: We explore an integrated approach that leverages advanced techniques in GenAI and computer vision, with an emphasis on image manipulation.
The methodology unfolds through several stages, including depth estimation, depth-based inpaint-mask creation, and the generation and replacement of backgrounds.
Experiments conducted in this study underscore the methodology's efficacy, highlighting its potential to produce visually captivating content.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, groundbreaking advancements in Generative Artificial
Intelligence (GenAI) have triggered a transformative paradigm shift,
significantly influencing various domains. In this work, we specifically
explore an integrated approach, leveraging advanced techniques in GenAI and
computer vision emphasizing image manipulation. The methodology unfolds through
several stages, including depth estimation, the creation of inpaint masks based
on depth information, the generation and replacement of backgrounds utilizing
Stable Diffusion in conjunction with Latent Consistency Models (LCMs), and the
subsequent replacement of clothes and application of aesthetic changes through
an inpainting pipeline. Experiments conducted in this study underscore the
methodology's efficacy, highlighting its potential to produce visually
captivating content. The convergence of these advanced techniques allows users
to input photographs of individuals and manipulate them to modify clothing and
background based on specific prompts without manually supplying inpainting masks,
effectively placing the subjects within the vast landscape of creative
imagination.
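The staged pipeline described in the abstract maps onto widely available components. Below is a minimal sketch under that reading, assuming the Hugging Face transformers and diffusers libraries; the model IDs, the 0.5 depth threshold, and the prompt are illustrative assumptions, not the authors' configuration.

```python
# Sketch of the described pipeline (illustrative, not the authors' code).
import numpy as np
import torch
from PIL import Image
from transformers import pipeline
from diffusers import AutoPipelineForInpainting, LCMScheduler

# 1) Depth estimation: a monocular depth map separates subject from background.
depth_estimator = pipeline("depth-estimation", model="Intel/dpt-large")
image = Image.open("person.jpg").convert("RGB").resize((512, 512))
depth = np.array(depth_estimator(image)["depth"], dtype=np.float32)
depth = (depth - depth.min()) / (depth.max() - depth.min())

# 2) Inpaint mask from depth: DPT outputs relative inverse depth (near = high),
#    so low values are treated as background; 0.5 is an assumed threshold.
background_mask = Image.fromarray(((depth < 0.5) * 255).astype(np.uint8))

# 3) Background generation/replacement: Stable Diffusion inpainting with an
#    LCM-LoRA, which cuts denoising to a handful of steps.
pipe = AutoPipelineForInpainting.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")

result = pipe(
    prompt="a sunlit beach at golden hour",
    image=image,
    mask_image=background_mask,   # white = region to repaint
    num_inference_steps=4,        # few steps thanks to the LCM
    guidance_scale=1.0,
).images[0]
result.save("replaced_background.png")

# 4) Attire replacement: rerun the same inpainting call with a clothing mask
#    (e.g., from a garment segmentation model) and a clothing prompt.
```

The attire stage reuses the exact inpainting call, only the mask source and prompt change, which is what makes the fully prompt-driven, mask-free user experience possible.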
Related papers
- Image inpainting enhancement by replacing the original mask with a self-attended region from the input image [44.8450669068833]
We introduce a novel deep learning-based pre-processing methodology for image inpainting utilizing the Vision Transformer (ViT).
Our approach involves replacing masked pixel values with those generated by the ViT, leveraging diverse visual patches within the attention matrix to capture discriminative spatial features.
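As a rough illustration of this pre-fill idea, the sketch below replaces each masked 16x16 patch with the visible patch it attends to most strongly, using transformers' ViTModel; the donor-copy rule and the file paths are simplifying assumptions, not the paper's method.

```python
# Illustrative pre-fill of masked patches via ViT attention (a simplification).
import numpy as np
import torch
from PIL import Image
from transformers import ViTImageProcessor, ViTModel

processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")
model = ViTModel.from_pretrained("google/vit-base-patch16-224")

image = Image.open("input.jpg").convert("RGB").resize((224, 224))  # assumed path
mask = np.load("mask.npy").astype(bool)  # assumed (224, 224), True = missing

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    attn = model(**inputs, output_attentions=True).attentions[-1]
# Average over heads and drop the [CLS] token: (196, 196) patch-to-patch map.
attn = attn[0].mean(0)[1:, 1:].numpy()

# Bookkeeping for the 14x14 grid of 16x16 patches.
patch_mask = mask.reshape(14, 16, 14, 16).any(axis=(1, 3)).reshape(-1)
pixels = np.array(image)
patches = pixels.reshape(14, 16, 14, 16, 3)  # view into `pixels`

attn[:, patch_mask] = -1.0  # a masked patch must never donate pixels
for idx in np.where(patch_mask)[0]:
    donor = int(attn[idx].argmax())  # most-attended visible patch
    r, c, dr, dc = idx // 14, idx % 14, donor // 14, donor % 14
    patches[r, :, c, :, :] = patches[dr, :, dc, :, :]

Image.fromarray(pixels).save("prefilled.png")  # feed this to the inpainter
```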
arXiv Detail & Related papers (2024-11-08T17:04:05Z) - TALE: Training-free Cross-domain Image Composition via Adaptive Latent Manipulation and Energy-guided Optimization [59.412236435627094]
TALE is a training-free framework harnessing the generative capabilities of text-to-image diffusion models.
We equip TALE with two mechanisms dubbed Adaptive Latent Manipulation and Energy-guided Latent Optimization.
Our experiments demonstrate that TALE surpasses prior baselines and attains state-of-the-art performance in image-guided composition.
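A training-free composition step in a broadly similar spirit (though much simpler than TALE's adaptive latent manipulation and energy-guided optimization) can be sketched as a partial image-to-image diffusion pass over a cut-and-paste composite; the model ID and strength value below are assumptions.

```python
# Training-free composition sketch (SDEdit-style, not TALE's actual mechanisms).
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

background = Image.open("background.jpg").convert("RGB").resize((512, 512))
obj = Image.open("object.png").convert("RGBA").resize((160, 160))
naive = background.copy()
naive.paste(obj, (176, 176), obj)  # crude cut-and-paste composite

# A partial denoise keeps the layout but harmonizes color, lighting, and seams;
# strength trades fidelity to the paste against blending (0.4 is an assumption).
out = pipe(prompt="a photo", image=naive, strength=0.4, guidance_scale=7.5).images[0]
out.save("composited.png")
```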
arXiv Detail & Related papers (2024-08-07T08:52:21Z) - Sketch-guided Image Inpainting with Partial Discrete Diffusion Process [5.005162730122933]
We introduce a novel partial discrete diffusion process (PDDP) for sketch-guided inpainting.
PDDP corrupts the masked regions of the image and reconstructs these masked regions conditioned on hand-drawn sketches.
A novel transformer module models the reverse diffusion process, accepting two inputs: the image containing the masked region to be inpainted and the query sketch.
arXiv Detail & Related papers (2024-04-18T07:07:38Z) - BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion [61.90969199199739]
BrushNet is a novel plug-and-play dual-branch model engineered to embed pixel-level masked image features into any pre-trained DM.
Experiments demonstrate BrushNet's superior performance over existing models across seven key metrics, including image quality, mask region preservation, and textual coherence.
arXiv Detail & Related papers (2024-03-11T17:59:31Z) - Disentangled Representation Learning for Controllable Person Image Generation [29.719070087384512]
We propose a novel framework named DRL-CPG to learn disentangled latent representation for controllable person image generation.
To our knowledge, we are the first to learn disentangled latent representations with transformers for person image generation.
arXiv Detail & Related papers (2023-12-10T07:15:58Z) - Masking Improves Contrastive Self-Supervised Learning for ConvNets, and Saliency Tells You Where [63.61248884015162]
We aim to alleviate the burden of incorporating the masking operation into the contrastive learning framework for convolutional neural networks.
We propose to explicitly take the saliency constraint into consideration in which the masked regions are more evenly distributed among the foreground and background.
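A minimal sketch of such saliency-balanced masking is given below; the patch size, mask ratio, median-based foreground/background split, and 50/50 allocation are assumptions for illustration, not the paper's exact procedure.

```python
# Sketch of saliency-balanced patch masking (parameters are assumptions).
import numpy as np

def saliency_balanced_mask(saliency, patch=16, ratio=0.5, rng=None):
    """Boolean patch mask whose masked patches are split evenly between
    salient (foreground) and non-salient (background) regions."""
    rng = rng or np.random.default_rng()
    h, w = saliency.shape[0] // patch, saliency.shape[1] // patch
    per_patch = (saliency[: h * patch, : w * patch]
                 .reshape(h, patch, w, patch).mean(axis=(1, 3)).ravel())
    order = np.argsort(per_patch)           # patches sorted by mean saliency
    bg, fg = order[: order.size // 2], order[order.size // 2 :]
    n = int(ratio * h * w)
    picks = np.concatenate([                # 50/50 foreground/background split
        rng.choice(fg, size=n // 2, replace=False),
        rng.choice(bg, size=n - n // 2, replace=False),
    ])
    mask = np.zeros(h * w, dtype=bool)
    mask[picks] = True
    return mask.reshape(h, w)

# Example: a 224x224 saliency map yields a 14x14 patch mask, half per region.
mask = saliency_balanced_mask(np.random.rand(224, 224))
```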
arXiv Detail & Related papers (2023-09-22T09:58:38Z) - Deep Image Matting: A Comprehensive Survey [85.77905619102802]
This paper presents a review of recent advancements in image matting in the era of deep learning.
We focus on two fundamental sub-tasks: auxiliary input-based image matting and automatic image matting.
We discuss relevant applications of image matting and highlight existing challenges and potential opportunities for future research.
arXiv Detail & Related papers (2023-04-10T15:48:55Z) - Expanding the Latent Space of StyleGAN for Real Face Editing [4.1715767752637145]
A surge of face editing techniques has been proposed to employ the pretrained StyleGAN for semantic manipulation.
To successfully edit a real image, one must first convert the input image into StyleGAN's latent variables.
We present a method to expand the latent space of StyleGAN with additional content features to break down the trade-off between low-distortion and high-editability.
arXiv Detail & Related papers (2022-04-26T18:27:53Z) - MAT: Mask-Aware Transformer for Large Hole Image Inpainting [79.67039090195527]
We present a novel model for large hole inpainting, which unifies the merits of transformers and convolutions.
Experiments demonstrate the state-of-the-art performance of the new model on multiple benchmark datasets.
arXiv Detail & Related papers (2022-03-29T06:36:17Z) - Look here! A parametric learning based approach to redirect visual attention [49.609412873346386]
We introduce an automatic method to make an image region more attention-capturing via subtle image edits.
Our model predicts a distinct set of global parametric transformations to be applied to the foreground and background image regions.
Our edits enable inference at interactive rates on any image size, and easily generalize to videos.
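The sketch below illustrates the general idea of applying distinct parametric edits to foreground and background; the fixed saturation and brightness gains stand in for the per-image parameters the paper's model would predict.

```python
# Sketch of distinct parametric foreground/background edits (gains assumed).
import numpy as np
from PIL import Image, ImageEnhance

def redirect_attention(image, fg_mask, fg_sat=1.3, bg_sat=0.6,
                       fg_bright=1.1, bg_bright=0.9):
    """Emphasize the masked region by boosting it and muting everything else."""
    fg = ImageEnhance.Brightness(
        ImageEnhance.Color(image).enhance(fg_sat)).enhance(fg_bright)
    bg = ImageEnhance.Brightness(
        ImageEnhance.Color(image).enhance(bg_sat)).enhance(bg_bright)
    alpha = Image.fromarray((fg_mask * 255).astype(np.uint8), mode="L")
    return Image.composite(fg, bg, alpha)  # white in `alpha` takes `fg`

image = Image.open("photo.jpg").convert("RGB")          # assumed input
fg_mask = np.zeros((image.height, image.width), np.float32)
fg_mask[100:300, 150:350] = 1.0                         # hypothetical region
redirect_attention(image, fg_mask).save("redirected.png")
```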
arXiv Detail & Related papers (2020-08-12T16:08:36Z) - Interactive Neural Style Transfer with Artists [6.130486652666935]
We present interactive painting processes in which a painter and various neural style transfer algorithms interact on a real canvas.
We gather a set of paired painting-picture images and present a new evaluation methodology based on the predictivity of neural style transfer algorithms.
arXiv Detail & Related papers (2020-03-14T15:27:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.