CompleteMe: Reference-based Human Image Completion
- URL: http://arxiv.org/abs/2504.20042v1
- Date: Mon, 28 Apr 2025 17:59:56 GMT
- Title: CompleteMe: Reference-based Human Image Completion
- Authors: Yu-Ju Tsai, Brian Price, Qing Liu, Luis Figueroa, Daniil Pakhomov, Zhihong Ding, Scott Cohen, Ming-Hsuan Yang
- Abstract summary: We propose CompleteMe, a novel reference-based human image completion framework. CompleteMe employs a dual U-Net architecture combined with a Region-focused Attention (RFA) Block. Our proposed method achieves superior visual quality and semantic consistency compared to existing techniques.
- Score: 52.93963237043788
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recent methods for human image completion can reconstruct plausible body shapes but often fail to preserve unique details, such as specific clothing patterns or distinctive accessories, without explicit reference images. Even state-of-the-art reference-based inpainting approaches struggle to accurately capture and integrate fine-grained details from reference images. To address this limitation, we propose CompleteMe, a novel reference-based human image completion framework. CompleteMe employs a dual U-Net architecture combined with a Region-focused Attention (RFA) Block, which explicitly guides the model's attention toward relevant regions in reference images. This approach effectively captures fine details and ensures accurate semantic correspondence, significantly improving the fidelity and consistency of completed images. Additionally, we introduce a challenging benchmark specifically designed for evaluating reference-based human image completion tasks. Extensive experiments demonstrate that our proposed method achieves superior visual quality and semantic consistency compared to existing techniques. Project page: https://liagm.github.io/CompleteMe/
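The abstract describes the RFA Block only at a high level. As a rough, non-authoritative sketch, region-focused attention can be read as masked cross-attention in which features from the inpainting U-Net query features from the reference U-Net, restricted by a relevant-region mask; the module name, shapes, and residual wiring below are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class RegionFocusedAttention(nn.Module):
    """Illustrative masked cross-attention between a dual U-Net's features."""

    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.heads = heads
        self.scale = (dim // heads) ** -0.5
        self.to_q = nn.Linear(dim, dim, bias=False)       # queries from the inpainting U-Net
        self.to_kv = nn.Linear(dim, dim * 2, bias=False)  # keys/values from the reference U-Net
        self.proj = nn.Linear(dim, dim)

    def forward(self, x, ref, region_mask):
        # x:           (B, N, C) features of the image being completed
        # ref:         (B, M, C) features of the reference image
        # region_mask: (B, M) bool, True where the reference is relevant;
        #              assumed to contain at least one True per sample
        B, N, C = x.shape
        q = self.to_q(x)
        k, v = self.to_kv(ref).chunk(2, dim=-1)
        # split heads -> (B, H, tokens, C // H)
        q, k, v = (t.view(B, -1, self.heads, C // self.heads).transpose(1, 2)
                   for t in (q, k, v))
        attn = (q @ k.transpose(-2, -1)) * self.scale     # (B, H, N, M)
        # suppress reference tokens outside the focused region
        attn = attn.masked_fill(~region_mask[:, None, None, :], float("-inf"))
        out = (attn.softmax(dim=-1) @ v).transpose(1, 2).reshape(B, N, C)
        return x + self.proj(out)                         # residual connection
```

Masking the attention logits rather than the inputs keeps the full reference in memory while forcing probability mass onto the focused region, which is one way to realize the abstract's claim of explicitly guiding attention toward relevant reference regions.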
Related papers
- Recovering Partially Corrupted Major Objects through Tri-modality Based Image Completion [13.846868357952419]
Diffusion models have become widely adopted in image completion tasks.
A persistent challenge arises when an object is partially obscured in the damaged region, yet its remaining parts are still visible in the background.
We propose supplementing text-based guidance with a novel visual aid: a casual sketch.
This sketch supplies critical structural cues, enabling the generative model to produce an object structure that seamlessly integrates with the existing background.
arXiv Detail & Related papers (2025-03-10T08:34:31Z)
- Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis [65.7968515029306]
We propose a novel Coarse-to-Fine Latent Diffusion (CFLD) method for Pose-Guided Person Image Synthesis (PGPIS).
A perception-refined decoder is designed to progressively refine a set of learnable queries and extract semantic understanding of person images as a coarse-grained prompt.
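The summary leaves the decoder's internals unspecified; the sketch below shows one plausible reading, assuming the learnable queries are refined by cross-attending to person-image features over a few decoder layers. The class name, depth, and dimensions are illustrative, not the CFLD configuration.

```python
import torch
import torch.nn as nn

class PerceptionRefinedDecoder(nn.Module):
    """Illustrative: learnable queries distilled into a coarse-grained prompt."""

    def __init__(self, dim: int = 768, num_queries: int = 16, depth: int = 4):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_queries, dim))
        layer = nn.TransformerDecoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=depth)

    def forward(self, img_feats):
        # img_feats: (B, N, dim) encoder features of the person image
        q = self.queries.unsqueeze(0).expand(img_feats.size(0), -1, -1)
        return self.decoder(q, img_feats)  # (B, num_queries, dim) coarse prompt
```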
arXiv Detail & Related papers (2024-02-28T06:07:07Z)
- ENTED: Enhanced Neural Texture Extraction and Distribution for Reference-based Blind Face Restoration [51.205673783866146]
We present ENTED, a new framework for blind face restoration that aims to restore high-quality and realistic portrait images.
We utilize a texture extraction and distribution framework to transfer high-quality texture features between the degraded input and reference image.
The StyleGAN-like architecture in our framework requires high-quality latent codes to generate realistic images.
arXiv Detail & Related papers (2024-01-13T04:54:59Z)
- CoSeR: Bridging Image and Language for Cognitive Super-Resolution [74.24752388179992]
We introduce the Cognitive Super-Resolution (CoSeR) framework, empowering SR models with the capacity to comprehend low-resolution images.
We achieve this by marrying image appearance and language understanding to generate a cognitive embedding.
To further improve image fidelity, we propose a novel condition injection scheme called "All-in-Attention".
arXiv Detail & Related papers (2023-11-27T16:33:29Z)
- Reference-Guided Texture and Structure Inference for Image Inpainting [25.775006005766222]
We build a benchmark dataset containing 10K pairs of input and reference images for reference-guided inpainting.
We adopt an encoder-decoder structure to infer the texture and structure features of the input image.
A feature alignment module is further designed to refine these features of the input image with the guidance of a reference image.
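The summary does not say how the alignment is computed. A common realization, assumed here purely for illustration, is correspondence-based warping: each input-feature location is matched to its most similar reference location by cosine similarity, and the reference features are gathered accordingly. The function name and the argmax matching rule are hypothetical, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def align_reference_features(inp, ref):
    # inp, ref: (B, C, H, W) encoder feature maps of the input and reference
    B, C, H, W = inp.shape
    inp_flat = F.normalize(inp.flatten(2), dim=1)  # (B, C, H*W), unit-norm channels
    ref_flat = F.normalize(ref.flatten(2), dim=1)
    corr = inp_flat.transpose(1, 2) @ ref_flat     # (B, H*W, H*W) cosine similarities
    idx = corr.argmax(dim=-1)                      # best reference match per input location
    gathered = torch.gather(ref.flatten(2), 2,
                            idx.unsqueeze(1).expand(-1, C, -1))
    return gathered.view(B, C, H, W)               # reference features warped to the input layout
```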
arXiv Detail & Related papers (2022-07-29T06:26:03Z)
- Learning Intrinsic Images for Clothing [10.21096394185778]
In this paper, we focus on intrinsic image decomposition for clothing images.
A more interpretable edge-aware metric and an annotation scheme are designed for the testing set.
We show that our proposed model significantly reduces texture-copying artifacts while retaining surprisingly fine details.
arXiv Detail & Related papers (2021-11-16T14:43:12Z)
- Image Inpainting Guided by Coherence Priors of Semantics and Textures [62.92586889409379]
We introduce coherence priors between semantics and textures, which make it possible to concentrate on completing separate textures in a semantics-aware manner.
We also propose two coherence losses to constrain the consistency between the semantics and the inpainted image in terms of the overall structure and detailed textures.
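The summary names the two losses but not their form. One plausible instantiation, assumed here only for concreteness, supervises overall structure via segmentation predicted from the inpainted image and detailed textures via feature-space distances; all symbols are illustrative rather than the paper's definitions.

```latex
\mathcal{L}_{\text{struct}} = \mathrm{CE}\!\left(S(\hat{I}),\, S_{\text{gt}}\right),
\qquad
\mathcal{L}_{\text{texture}} = \sum_{l} \left\| \phi_l(\hat{I}) - \phi_l(I_{\text{gt}}) \right\|_{1}
```

Here \hat{I} is the inpainted image, S(\cdot) a semantic segmentation head, S_gt the ground-truth segmentation, and \phi_l features at layer l of a fixed encoder.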
arXiv Detail & Related papers (2020-12-15T02:59:37Z)
- Guidance and Evaluation: Semantic-Aware Image Inpainting for Mixed Scenes [54.836331922449666]
We propose a Semantic Guidance and Evaluation Network (SGE-Net) to update the structural priors and the inpainted image.
It utilizes a semantic segmentation map as guidance at each scale of inpainting, under which location-dependent inferences are re-evaluated.
Experiments on real-world images of mixed scenes demonstrate the superiority of our proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2020-03-15T17:49:20Z)