FCBoost-Net: A Generative Network for Synthesizing Multiple Collocated Outfits via Fashion Compatibility Boosting
- URL: http://arxiv.org/abs/2502.00992v1
- Date: Mon, 03 Feb 2025 02:18:09 GMT
- Title: FCBoost-Net: A Generative Network for Synthesizing Multiple Collocated Outfits via Fashion Compatibility Boosting
- Authors: Dongliang Zhou, Haijun Zhang, Jianghong Ma, Jicong Fan, Zhao Zhang
- Abstract summary: We present FCBoost-Net, a new framework for outfit generation that leverages the power of pre-trained generative models.
FCBoost-Net randomly synthesizes multiple sets of fashion items, and the compatibility of the synthesized sets is then improved in several rounds using a novel fashion compatibility booster.
Empirical evidence indicates that the proposed strategy can improve the fashion compatibility of randomly synthesized fashion items as well as maintain their diversity.
- Score: 37.32190866187711
- Abstract: Outfit generation is a challenging task in the field of fashion technology, in which the aim is to create a collocated set of fashion items that complement a given set of items. Previous studies in this area have been limited to generating a unique set of fashion items based on a given set of items, without providing additional options to users. This lack of a diverse range of choices necessitates the development of a more versatile framework. However, when the task of generating collocated and diversified outfits is approached with multimodal image-to-image translation methods, it poses a challenging problem in terms of non-aligned image translation, which is hard to address with existing methods. In this research, we present FCBoost-Net, a new framework for outfit generation that leverages the power of pre-trained generative models to produce multiple collocated and diversified outfits. Initially, FCBoost-Net randomly synthesizes multiple sets of fashion items, and the compatibility of the synthesized sets is then improved in several rounds using a novel fashion compatibility booster. This approach was inspired by boosting algorithms and allows the performance to be gradually improved in multiple steps. Empirical evidence indicates that the proposed strategy can improve the fashion compatibility of randomly synthesized fashion items as well as maintain their diversity. Extensive experiments confirm the effectiveness of our proposed framework with respect to visual authenticity, diversity, and fashion compatibility.
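The multi-round boosting strategy described in the abstract can be summarized in a short sketch. This is a minimal illustration only: the `generator` and `booster` callables, the latent size, and the function name are assumptions for exposition, not the authors' released implementation.

```python
# Minimal sketch of FCBoost-Net's multi-round compatibility boosting,
# as described in the abstract. The generator, booster, latent size, and
# function names here are illustrative assumptions, not the paper's code.
import torch

def synthesize_collocated_outfits(generator, booster, given_items,
                                  num_sets=4, num_rounds=3):
    """Sample several candidate outfit sets, then refine each one over
    multiple boosting rounds to raise fashion compatibility while keeping
    the diversity injected by the random latent codes."""
    # Round 0: randomly synthesize diverse candidate sets.
    candidates = [generator(given_items, torch.randn(1, 512))
                  for _ in range(num_sets)]
    # Rounds 1..T: each pass nudges a candidate set toward higher
    # compatibility with the given items, analogous to boosting.
    for _ in range(num_rounds):
        candidates = [booster(given_items, c) for c in candidates]
    return candidates
```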
Related papers
- COutfitGAN: Learning to Synthesize Compatible Outfits Supervised by Silhouette Masks and Fashion Styles [23.301719420997927]
We propose a new task of generating complementary and compatible fashion items based on an arbitrary number of given fashion items.
In particular, given some fashion items that can make up an outfit, the aim of this paper is to synthesize photo-realistic images of other complementary fashion items that are compatible with the given ones.
To achieve this, we propose an outfit generation framework, referred to as COutfitGAN, which includes a pyramid style extractor, an outfit generator, a UNet-based real/fake discriminator, and a collocation discriminator.
arXiv Detail & Related papers (2025-02-12T03:32:28Z)
- Learning to Synthesize Compatible Fashion Items Using Semantic Alignment and Collocation Classification: An Outfit Generation Framework [59.09707044733695]
We propose a novel outfit generation framework, OutfitGAN, with the aim of synthesizing an entire outfit.
OutfitGAN includes a semantic alignment module, which is responsible for characterizing the mapping correspondence between the existing fashion items and the synthesized ones.
In order to evaluate the performance of our proposed models, we built a large-scale dataset consisting of 20,000 fashion outfits.
arXiv Detail & Related papers (2025-02-05T12:13:53Z)
- BC-GAN: A Generative Adversarial Network for Synthesizing a Batch of Collocated Clothing [17.91576511810969]
Collocated clothing synthesis using generative networks has significant economic potential for increasing revenue in the fashion industry.
We introduce a novel batch clothing generation framework, named BC-GAN, which is able to synthesize multiple visually collocated clothing images simultaneously.
Our model was evaluated on a large-scale dataset of compatible outfits that we constructed ourselves.
arXiv Detail & Related papers (2025-02-03T05:41:41Z)
- UniFashion: A Unified Vision-Language Model for Multimodal Fashion Retrieval and Generation [29.489516715874306]
We present UniFashion, a unified framework that simultaneously tackles the challenges of multimodal generation and retrieval tasks within the fashion domain.
Our model significantly outperforms previous single-task state-of-the-art models across diverse fashion tasks.
arXiv Detail & Related papers (2024-08-21T03:17:20Z)
- Multi-Garment Customized Model Generation [3.1679243514285194]
Multi-Garment Customized Model Generation is a unified framework based on Latent Diffusion Models (LDMs).
Our framework supports the conditional generation of multiple garments through decoupled multi-garment feature fusion.
The proposed garment encoder is a plug-and-play module that can be combined with other extension modules.
arXiv Detail & Related papers (2024-08-09T17:57:33Z)
- AnyFit: Controllable Virtual Try-on for Any Combination of Attire Across Any Scenario [50.62711489896909]
AnyFit surpasses all baselines on high-resolution benchmarks and real-world data by a large margin.
AnyFit's impressive performance on high-fidelity virtual try-ons in any scenario from any image paves a new path for future research within the fashion community.
arXiv Detail & Related papers (2024-05-28T13:33:08Z)
- FaD-VLP: Fashion Vision-and-Language Pre-training towards Unified Retrieval and Captioning [66.38951790650887]
Multimodal tasks in the fashion domain have significant potential for e-commerce.
We propose a novel fashion-specific pre-training framework based on weakly-supervised triplets constructed from fashion image-text pairs.
We show the triplet-based tasks are an effective addition to standard multimodal pre-training tasks.
arXiv Detail & Related papers (2022-10-26T21:01:19Z)
- M6-Fashion: High-Fidelity Multi-modal Image Generation and Editing [51.033376763225675]
We adapt style prior knowledge and the flexibility of multi-modal control into one unified two-stage framework, M6-Fashion, focusing on practical AI-aided fashion design.
M6-Fashion utilizes self-correction for non-autoregressive generation to improve inference speed, enhance holistic consistency, and support various signal controls.
arXiv Detail & Related papers (2022-05-24T01:18:14Z)
- Learning Diverse Fashion Collocation by Neural Graph Filtering [78.9188246136867]
We propose a novel fashion collocation framework, Neural Graph Filtering, that models a flexible set of fashion items via a graph neural network.
By applying symmetric operations to the edge vectors, this framework accepts varying numbers of inputs/outputs and is invariant to their ordering (a minimal sketch of this idea appears after the list).
We evaluate the proposed approach on three popular benchmarks, the Polyvore dataset, the Polyvore-D dataset, and our reorganized Amazon Fashion dataset.
arXiv Detail & Related papers (2020-03-11T16:17:08Z)
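As a companion to the Neural Graph Filtering entry above, the sketch below illustrates the symmetric-aggregation idea: because the pairwise edge vectors are pooled with a sum, the output does not depend on how many items are present or how they are ordered. The module name, dimensions, and MLP are assumptions for illustration, not the paper's implementation.

```python
# Toy permutation-invariant filter over pairwise edge vectors, in the
# spirit of Neural Graph Filtering. All names and sizes are illustrative.
import torch
import torch.nn as nn

class SymmetricEdgeFilter(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.edge_mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())

    def forward(self, item_features):
        # item_features: (n_items, dim); n_items may vary per outfit.
        n = item_features.size(0)
        # Build an edge vector for every ordered pair of items
        # (self-pairs included, for simplicity).
        src = item_features.unsqueeze(1).expand(n, n, -1)   # (n, n, dim)
        dst = item_features.unsqueeze(0).expand(n, n, -1)   # (n, n, dim)
        edges = self.edge_mlp(torch.cat([src, dst], dim=-1))  # (n, n, dim)
        # Summing over all pairs is a symmetric operation, so the result
        # is invariant to the ordering (and number) of input items.
        return edges.sum(dim=(0, 1))  # (dim,)
```

Feeding this module three items or five items yields the same output shape, which is what lets a graph-based collocation model handle outfits of arbitrary size.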