Related papers: Fashion Style Editing with Generative Human Prior

URL: http://arxiv.org/abs/2404.01984v1
Date: Tue, 2 Apr 2024 14:22:04 GMT
Title: Fashion Style Editing with Generative Human Prior
Authors: Chaerin Kong, Seungyong Lee, Soohyeok Im, Wonsuk Yang
Abstract summary: In this work, we aim to manipulate the fashion style of human imagery using text descriptions. Specifically, we leverage a generative human prior and achieve fashion style editing by navigating its learned latent space. Our framework successfully projects abstract fashion concepts onto human images and introduces exciting new applications to the field.
Score: 9.854813629782681
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Image editing has been a long-standing challenge in the research community with its far-reaching impact on numerous applications. Recently, text-driven methods started to deliver promising results in domains like human faces, but their applications to more complex domains have been relatively limited. In this work, we explore the task of fashion style editing, where we aim to manipulate the fashion style of human imagery using text descriptions. Specifically, we leverage a generative human prior and achieve fashion style editing by navigating its learned latent space. We first verify that the existing text-driven editing methods fall short for our problem due to their overly simplified guidance signal, and propose two directions to reinforce the guidance: textual augmentation and visual referencing. Combined with our empirical findings on the latent space structure, our Fashion Style Editing framework (FaSE) successfully projects abstract fashion concepts onto human images and introduces exciting new applications to the field.

Related papers

Fashion-RAG: Multimodal Fashion Image Editing via Retrieval-Augmented Generation [36.66066619847558]
The fashion industry has increasingly adopted AI technologies to enhance customer experience. Fashion-RAG is the first work to introduce a retrieval-augmented generation approach specifically tailored for multimodal fashion image editing.
arXiv Detail & Related papers (2025-04-18T18:02:33Z)
Prompt Augmentation for Self-supervised Text-guided Image Manipulation [34.01939157351624]
We introduce prompt augmentation, a method amplifying a single input prompt into several target prompts, strengthening textual context and enabling localised image editing. We propose a Contrastive Loss tailored to driving effective image editing by displacing edited areas and drawing preserved regions closer. New losses are incorporated into the diffusion model, demonstrating improved or competitive image editing results on public datasets and generated images over state-of-the-art approaches.
arXiv Detail & Related papers (2024-12-17T16:54:05Z)
TEDRA: Text-based Editing of Dynamic and Photoreal Actors [59.480513384611804]
TEDRA is the first method allowing text-based edits of an avatar. We train a model to create a controllable and high-fidelity digital replica of the real actor. We modify the dynamic avatar based on a provided text prompt.
arXiv Detail & Related papers (2024-08-28T17:59:02Z)
A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models [117.77807994397784]
Image editing aims to edit a given synthetic or real image to meet specific user requirements. Recent significant advances in this field are based on the development of text-to-image (T2I) diffusion models. T2I-based image editing methods significantly enhance editing performance and offer a user-friendly interface for modifying content guided by multimodal inputs.
arXiv Detail & Related papers (2024-06-20T17:58:52Z)
Multimodal-Conditioned Latent Diffusion Models for Fashion Image Editing [40.70752781891058]
This paper tackles the task of multimodal-conditioned fashion image editing. Our approach aims to generate human-centric fashion images guided by multimodal prompts, including text, human body poses, garment sketches, and fabric textures.
arXiv Detail & Related papers (2024-03-21T20:43:10Z)
AdapEdit: Spatio-Temporal Guided Adaptive Editing Algorithm for Text-Based Continuity-Sensitive Image Editing [24.9487669818162]
We propose a spatio-temporal guided adaptive editing algorithm, AdapEdit, which realizes adaptive image editing. Our approach has a significant advantage in preserving model priors and does not require model training, fine-tuning, extra data, or optimization. We present our results over a wide variety of raw images and editing instructions, demonstrating competitive performance and showing that it significantly outperforms previous approaches.
arXiv Detail & Related papers (2023-12-13T09:45:58Z)
Style Aligned Image Generation via Shared Attention [61.121465570763085]
We introduce StyleAligned, a technique designed to establish style alignment among a series of generated images. By employing minimal 'attention sharing' during the diffusion process, our method maintains style consistency across images within T2I models. Our method's evaluation across diverse styles and text prompts demonstrates high quality and fidelity.
arXiv Detail & Related papers (2023-12-04T18:55:35Z)
Text-Driven Image Editing via Learnable Regions [74.45313434129005]
We introduce a method for region-based image editing driven by textual prompts, without the need for user-provided masks or sketches. We show that this simple approach enables flexible editing that is compatible with current image generation models. Experiments demonstrate the competitive performance of our method in manipulating images with high fidelity and realism that correspond to the provided language descriptions.
arXiv Detail & Related papers (2023-11-28T02:27:31Z)
ControlStyle: Text-Driven Stylized Image Generation Using Diffusion Priors [105.37795139586075]
We propose a new task for "stylizing" text-to-image models, namely text-driven stylized image generation. We present a new diffusion model (ControlStyle) by upgrading a pre-trained text-to-image model with a trainable modulation network. Experiments demonstrate the effectiveness of our ControlStyle in producing more visually pleasing and artistic results.
arXiv Detail & Related papers (2023-11-09T15:50:52Z)
Fashion Matrix: Editing Photos by Just Talking [66.83502497764698]
We develop a hierarchical AI system called Fashion Matrix dedicated to editing photos by just talking. Fashion Matrix employs Large Language Models (LLMs) as its foundational support and engages in iterative interactions with users. Visual Foundation Models are leveraged to generate edited images from text prompts and masks, thereby facilitating the automation of fashion editing processes.
arXiv Detail & Related papers (2023-07-25T04:06:25Z)
Text-guided Image-and-Shape Editing and Generation: A Short Survey [0.0]
With recent advances in machine learning, artists' editing intents can even be driven by text. In this short survey, we provide an overview over 50 papers on state-of-the-art (text-guided) image-and-shape generation techniques.
arXiv Detail & Related papers (2023-04-18T19:11:36Z)
Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing [40.70752781891058]
We propose the task of multimodal-conditioned fashion image editing, guiding the generation of human-centric fashion images. We tackle this problem by proposing a new architecture based on latent diffusion models. Given the lack of existing datasets suitable for the task, we also extend two existing fashion datasets.
arXiv Detail & Related papers (2023-04-04T18:03:04Z)
FICE: Text-Conditioned Fashion Image Editing With Guided GAN Inversion [16.583537785874604]
We propose a novel text-conditioned editing model, called FICE, capable of handling a wide variety of diverse text descriptions. FICE generates highly realistic fashion images and leads to stronger editing performance than existing competing approaches.
arXiv Detail & Related papers (2023-01-05T15:33:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.