Training-Free Consistency Pipeline for Fashion Repose
- URL: http://arxiv.org/abs/2501.13692v1
- Date: Thu, 23 Jan 2025 14:17:01 GMT
- Title: Training-Free Consistency Pipeline for Fashion Repose
- Authors: Potito Aghilar, Vito Walter Anelli, Michelantonio Trizio, Tommaso Di Noia
- Abstract summary: FashionRepose is a training-free pipeline for non-rigid pose editing. It integrates off-the-shelf models to adjust poses of long-sleeve garments, maintaining identity and branding attributes. FashionRepose uses a zero-shot approach to perform these edits in near real-time, eliminating the need for specialized training.
- Score: 9.61065600471628
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recent advancements in diffusion models have significantly broadened the possibilities for editing images of real-world objects. However, performing non-rigid transformations, such as changing the pose of objects or image-based conditioning, remains challenging. Maintaining object identity during these edits is difficult, and current methods often fall short of the precision needed for industrial applications, where consistency is critical. Additionally, fine-tuning diffusion models requires custom training data, which is not always accessible in real-world scenarios. This work introduces FashionRepose, a training-free pipeline for non-rigid pose editing specifically designed for the fashion industry. The approach integrates off-the-shelf models to adjust poses of long-sleeve garments, maintaining identity and branding attributes. FashionRepose uses a zero-shot approach to perform these edits in near real-time, eliminating the need for specialized training. The solution holds potential for applications in the fashion industry and other fields demanding identity preservation in image editing.
Related papers
- Teleportraits: Training-Free People Insertion into Any Scene [59.76038137014233]
We introduce a unified training-free pipeline that leverages pre-trained text-to-image diffusion models. We show that diffusion models inherently possess the knowledge to place people in complex scenes without requiring task-specific training. Our method achieves affordance-aware global editing, seamlessly inserting people into scenes.
arXiv Detail & Related papers (2025-10-07T08:12:57Z)
- FashionPose: Text to Pose to Relight Image Generation for Personalized Fashion Visualization [0.29998889086656577]
We introduce FashionPose, the first unified text-to-pose-to-relighting generation framework. By replacing explicit pose annotations with text-driven conditioning, FashionPose enables accurate pose alignment, faithful garment rendering, and flexible lighting control. Experiments demonstrate fine-grained pose synthesis and efficient, consistent relighting, providing a practical solution for personalized virtual fashion display.
arXiv Detail & Related papers (2025-07-17T17:30:29Z)
- Image-Editing Specialists: An RLAIF Approach for Diffusion Models [28.807572302899004]
We present a novel approach to training specialized instruction-based image-editing diffusion models.
We introduce an online reinforcement learning framework that aligns the diffusion model with human preferences.
Experimental results demonstrate that our models can perform intricate edits in complex scenes, after just 10 training steps.
arXiv Detail & Related papers (2025-04-17T10:46:39Z)
- UIP2P: Unsupervised Instruction-based Image Editing via Cycle Edit Consistency [69.33072075580483]
We propose an unsupervised model for instruction-based image editing that eliminates the need for ground-truth edited images during training.
Our method addresses these challenges by introducing a novel editing mechanism called Cycle Edit Consistency (CEC).
CEC applies forward and backward edits in one training step and enforces consistency in image and attention spaces.
arXiv Detail & Related papers (2024-12-19T18:59:58Z)
- INRetouch: Context Aware Implicit Neural Representation for Photography Retouching [54.17599183365242]
We propose a novel retouch transfer approach that learns from professional edits through before-after image pairs. We develop a context-aware Implicit Neural Representation that learns to apply edits adaptively based on image content and context. Our approach not only surpasses existing methods in photo retouching but also enhances performance in related image reconstruction tasks.
arXiv Detail & Related papers (2024-12-05T03:31:48Z)
- Learning Feature-Preserving Portrait Editing from Generated Pairs [11.122956539965761]
We propose a training-based method leveraging auto-generated paired data to learn desired editing.
Our method achieves state-of-the-art quality, quantitatively and qualitatively.
arXiv Detail & Related papers (2024-07-29T23:19:42Z)
- Editing 3D Scenes via Text Prompts without Retraining [80.57814031701744]
DN2N is a text-driven editing method that allows for the direct acquisition of a NeRF model with universal editing capabilities.
Our method employs off-the-shelf text-based editing models of 2D images to modify the 3D scene images.
Our method achieves multiple editing types, including but not limited to appearance editing, weather transition, material changing, and style transfer.
arXiv Detail & Related papers (2023-09-10T02:31:50Z)
- Fashion Matrix: Editing Photos by Just Talking [66.83502497764698]
We develop a hierarchical AI system called Fashion Matrix dedicated to editing photos by just talking.
Fashion Matrix employs Large Language Models (LLMs) as its foundational support and engages in iterative interactions with users.
Visual Foundation Models are leveraged to generate edited images from text prompts and masks, thereby facilitating the automation of fashion editing processes.
arXiv Detail & Related papers (2023-07-25T04:06:25Z)
- Realistic Saliency Guided Image Enhancement [32.446298454642985]
Common editing operations performed by professional photographers include de-emphasizing distracting elements and enhancing subjects.
We propose a realism loss for saliency-guided image enhancement to maintain high realism across varying image types.
We outperform the recent approaches on their own datasets, while requiring a smaller memory footprint and runtime.
arXiv Detail & Related papers (2023-06-09T17:52:34Z)
- ReGeneration Learning of Diffusion Models with Rich Prompts for Zero-Shot Image Translation [8.803251014279502]
Large-scale text-to-image models have demonstrated amazing ability to synthesize diverse and high-fidelity images.
Current models can impose significant changes to the original image content during the editing process.
We propose ReGeneration learning in an image-to-image Diffusion model (ReDiffuser).
arXiv Detail & Related papers (2023-05-08T12:08:12Z)
- Fashion-model pose recommendation and generation using Machine Learning [0.0]
This research concentrates on suggesting to fashion personnel a series of images similar to the input image.
The image is segmented into different parts and similar images are suggested for the user.
This was achieved by calculating the color histogram of the input image and applying the same for all the images in the dataset.
arXiv Detail & Related papers (2023-02-19T09:12:46Z)
- Zero-shot Image-to-Image Translation [57.46189236379433]
We propose pix2pix-zero, an image-to-image translation method that can preserve the original image without manual prompting.
We propose cross-attention guidance, which aims to retain the cross-attention maps of the input image throughout the diffusion process.
Our method does not need additional training for these edits and can directly use the existing text-to-image diffusion model.
arXiv Detail & Related papers (2023-02-06T18:59:51Z)
- Enjoy Your Editing: Controllable GANs for Image Editing via Latent Space Navigation [136.53288628437355]
Controllable semantic image editing enables a user to change entire image attributes with few clicks.
Current approaches often suffer from attribute edits that are entangled, global image identity changes, and diminished photo-realism.
We propose quantitative evaluation strategies for measuring controllable editing performance, unlike prior work which primarily focuses on qualitative evaluation.
arXiv Detail & Related papers (2021-02-01T21:38:36Z)