Related papers: Beyond Imperfections: A Conditional Inpainting Approach for End-to-End Artifact Removal in VTON and Pose Transfer

URL: http://arxiv.org/abs/2410.04052v1
Date: Sat, 5 Oct 2024 06:18:26 GMT
Title: Beyond Imperfections: A Conditional Inpainting Approach for End-to-End Artifact Removal in VTON and Pose Transfer
Authors: Aref Tabatabaei, Zahra Dehghanian, Maryam Amirmazlaghani
Abstract summary: Artifacts often degrade the visual quality of virtual try-on (VTON) and pose transfer applications. This study introduces a novel conditional inpainting technique designed to detect and remove such distortions.
Score: 2.990411348977783
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Artifacts often degrade the visual quality of virtual try-on (VTON) and pose transfer applications, impacting user experience. This study introduces a novel conditional inpainting technique designed to detect and remove such distortions, improving image aesthetics. Our work is the first to present an end-to-end framework addressing this specific issue, and we developed a specialized dataset of artifacts in VTON and pose transfer tasks, complete with masks highlighting the affected areas. Experimental results show that our method not only effectively removes artifacts but also significantly enhances the visual quality of the final images, setting a new benchmark in computer vision and image processing.

Related papers

Image inpainting enhancement by replacing the original mask with a self-attended region from the input image [44.8450669068833]
We introduce a novel deep learning-based pre-processing methodology for image inpainting utilizing the Vision Transformer (ViT). Our approach involves replacing masked pixel values with those generated by the ViT, leveraging diverse visual patches within the attention matrix to capture discriminative spatial features.
arXiv Detail & Related papers (2024-11-08T17:04:05Z)
DiffUHaul: A Training-Free Method for Object Dragging in Images [78.93531472479202]
We propose a training-free method, dubbed DiffUHaul, for the object dragging task. We first apply attention masking in each denoising step to make the generation more disentangled across different objects. In the early denoising steps, we interpolate the attention features between source and target images to smoothly fuse new layouts with the original appearance.
arXiv Detail & Related papers (2024-06-03T17:59:53Z)
Joint Quality Assessment and Example-Guided Image Processing by Disentangling Picture Appearance from Content [30.939589712281684]
Deep learning has impacted low-level image processing tasks such as style/domain transfer, enhancement/restoration, and visual quality assessments. We leverage this observation to develop a novel disentangled representation learning method that decomposes inputs into content and appearance features. We demonstrate through extensive evaluations that DisQUE achieves accuracy across quality prediction tasks and distortion types.
arXiv Detail & Related papers (2024-04-20T23:02:57Z)
PRISM: Progressive Restoration for Scene Graph-based Image Manipulation [47.77003316561398]
PRISM is a novel multi-head image manipulation approach to improve the accuracy and quality of the manipulated regions in the scene. Our results demonstrate the potential of our approach for enhancing the quality and precision of scene graph-based image manipulation.
arXiv Detail & Related papers (2023-11-03T21:30:34Z)
Perceptual Artifacts Localization for Image Synthesis Tasks [59.638307505334076]
We introduce a novel dataset comprising 10,168 generated images, each annotated with per-pixel perceptual artifact labels. A segmentation model, trained on our proposed dataset, effectively localizes artifacts across a range of tasks. We propose an innovative zoom-in inpainting pipeline that seamlessly rectifies perceptual artifacts in the generated images.
arXiv Detail & Related papers (2023-10-09T10:22:08Z)
TexPose: Neural Texture Learning for Self-Supervised 6D Object Pose Estimation [55.94900327396771]
We introduce neural texture learning for 6D object pose estimation from synthetic data. We learn to predict realistic texture of objects from real image collections. We learn pose estimation from pixel-perfect synthetic data.
arXiv Detail & Related papers (2022-12-25T13:36:32Z)
C-VTON: Context-Driven Image-Based Virtual Try-On Network [1.0832844764942349]
We propose a Context-Driven Virtual Try-On Network (C-VTON) that convincingly transfers selected clothing items to the target subjects. At the core of the C-VTON pipeline are: (i) a geometric matching procedure that efficiently aligns the target clothing with the pose of the person in the input images, and (ii) a powerful image generator that utilizes various types of contextual information when synthesizing the final try-on result.
arXiv Detail & Related papers (2022-12-08T17:56:34Z)
Perceptual Artifacts Localization for Inpainting [60.5659086595901]
We propose a new learning task of automatic segmentation of inpainting perceptual artifacts. We train advanced segmentation networks on a dataset to reliably localize inpainting artifacts within inpainted images. We also propose a new evaluation metric called Perceptual Artifact Ratio (PAR), which is the ratio of objectionable inpainted regions to the entire inpainted area.
arXiv Detail & Related papers (2022-08-05T18:50:51Z)
End-to-End Visual Editing with a Generatively Pre-Trained Artist [78.5922562526874]
We consider the targeted image editing problem: blending a region in a source image with a driver image that specifies the desired change. We propose a self-supervised approach that simulates edits by augmenting off-the-shelf images in a target domain. We show that different blending effects can be learned by an intuitive control of the augmentation process, with no other changes required to the model architecture.
arXiv Detail & Related papers (2022-05-03T17:59:30Z)
Controllable Person Image Synthesis with Spatially-Adaptive Warped Normalization [72.65828901909708]
Controllable person image generation aims to produce realistic human images with desirable attributes. We introduce a novel Spatially-Adaptive Warped Normalization (SAWN), which integrates a learned flow-field to warp modulation parameters. We propose a novel self-training part replacement strategy to refine the pretrained model for the texture-transfer task.
arXiv Detail & Related papers (2021-05-31T07:07:44Z)
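
The Perceptual Artifact Ratio (PAR) mentioned in the "Perceptual Artifacts Localization for Inpainting" entry above is described as the ratio of objectionable inpainted regions to the entire inpainted area. A minimal sketch of that ratio follows, assuming both regions are given as binary pixel masks of equal size. The function name, the mask representation, and the choice to return 0.0 for an empty inpainted area are illustrative assumptions, not details taken from the paper.

```python
def perceptual_artifact_ratio(inpaint_mask, artifact_mask):
    """Fraction of inpainted pixels that are also flagged as artifacts.

    Both arguments are 2D lists of 0/1 values with the same shape
    (an assumed representation; the paper may use a different one).
    """
    inpainted = 0
    objectionable = 0
    for hole_row, artifact_row in zip(inpaint_mask, artifact_mask):
        for hole, artifact in zip(hole_row, artifact_row):
            if hole:
                inpainted += 1
                # Only artifacts inside the inpainted area are counted.
                if artifact:
                    objectionable += 1
    if inpainted == 0:
        return 0.0  # assumption: treat an empty inpainted area as PAR = 0
    return objectionable / inpainted


# Toy 3x4 example: 4 inpainted pixels, 1 of them flagged as an artifact.
inpaint_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
artifact_mask = [
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],  # lies outside the inpainted area, so it is ignored
]
par = perceptual_artifact_ratio(inpaint_mask, artifact_mask)
print(par)  # 0.25
```

In practice the artifact mask would come from a segmentation model such as the ones these entries describe; here it is written by hand purely for illustration.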

This list is automatically generated from the titles and abstracts of the papers on this site.