Dressing the Imagination: A Dataset for AI-Powered Translation of Text into Fashion Outfits and A Novel KAN Adapter for Enhanced Feature Adaptation
- URL: http://arxiv.org/abs/2411.13901v1
- Date: Thu, 21 Nov 2024 07:27:45 GMT
- Title: Dressing the Imagination: A Dataset for AI-Powered Translation of Text into Fashion Outfits and A Novel KAN Adapter for Enhanced Feature Adaptation
- Authors: Gayatri Deshmukh, Somsubhra De, Chirag Sehgal, Jishu Sen Gupta, Sparsh Mittal,
- Abstract summary: We present FLORA, the first comprehensive dataset containing 4,330 curated pairs of fashion outfits and corresponding textual descriptions.
As a second contribution, we introduce KAN Adapters, which leverage Kolmogorov-Arnold Networks (KAN) as adaptive modules.
To foster further research and collaboration, we will open-source both the FLORA dataset and our implementation code.
- Score: 2.3010373219231495
- Abstract: Specialized datasets that capture the fashion industry's rich language and styling elements can boost progress in AI-driven fashion design. We present FLORA (Fashion Language Outfit Representation for Apparel Generation), the first comprehensive dataset containing 4,330 curated pairs of fashion outfits and corresponding textual descriptions. Each description utilizes industry-specific terminology and jargon commonly used by professional fashion designers, providing precise and detailed insights into the outfits. Hence, the dataset captures the delicate features and subtle stylistic elements necessary to create high-fidelity fashion designs. We demonstrate that fine-tuning generative models on the FLORA dataset significantly enhances their capability to generate accurate and stylistically rich images from textual descriptions of fashion sketches. FLORA will catalyze the creation of advanced AI models capable of comprehending and producing subtle, stylistically rich fashion designs. It will also help fashion designers and end-users to bring their ideas to life. As a second orthogonal contribution, we introduce KAN Adapters, which leverage Kolmogorov-Arnold Networks (KAN) as adaptive modules. They serve as replacements for traditional MLP-based LoRA adapters. With learnable spline-based activations, KAN Adapters excel in modeling complex, non-linear relationships, achieving superior fidelity, faster convergence and semantic alignment. Extensive experiments and ablation studies on our proposed FLORA dataset validate the superiority of KAN Adapters over LoRA adapters. To foster further research and collaboration, we will open-source both the FLORA dataset and our implementation code.
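The abstract contrasts LoRA adapters with KAN Adapters built on learnable spline-based activations. The following is a minimal, dependency-free sketch of that contrast, not the authors' implementation. It assumes piecewise-linear "hat" functions (order-1 B-splines) on a fixed uniform grid in place of the higher-order splines a real KAN would typically use, and it assumes a zero-initialised output layer so the adapter starts as a no-op, mirroring the usual LoRA convention. All names (`LoRAAdapter`, `KANLayer`, `KANAdapter`, `adapted_forward`) are hypothetical.

```python
import random


def hat_basis(x, grid):
    """Values of piecewise-linear bumps (order-1 B-splines) centred on a uniform grid."""
    h = grid[1] - grid[0]
    return [max(0.0, 1.0 - abs(x - g) / h) for g in grid]


def matvec(M, x):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


class LoRAAdapter:
    """Low-rank linear update: delta(x) = (alpha / rank) * B @ (A @ x)."""

    def __init__(self, dim, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        self.A = [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(dim)]  # zero init: no change at start
        self.scale = alpha / rank

    def __call__(self, x):
        return [self.scale * v for v in matvec(self.B, matvec(self.A, x))]


class KANLayer:
    """Every edge (i, j) owns a learnable 1-D function
    phi_ij(x_j) = sum_k coef[i][j][k] * hat_k(x_j); output i sums phi_ij over j."""

    def __init__(self, in_dim, out_dim, grid, init_scale, seed=0):
        rng = random.Random(seed)
        self.grid = grid
        self.coef = [[[init_scale * rng.uniform(-1.0, 1.0) for _ in grid]
                      for _ in range(in_dim)] for _ in range(out_dim)]

    def __call__(self, x):
        basis = [hat_basis(xj, self.grid) for xj in x]
        return [sum(c * b
                    for edge, bj in zip(edges, basis)
                    for c, b in zip(edge, bj))
                for edges in self.coef]


class KANAdapter:
    """Bottleneck adapter whose down/up maps are KAN layers (assumed layout)."""

    def __init__(self, dim, rank, grid, seed=0):
        self.down = KANLayer(dim, rank, grid, init_scale=0.1, seed=seed)
        self.up = KANLayer(rank, dim, grid, init_scale=0.0, seed=seed)  # zero init

    def __call__(self, x):
        return self.up(self.down(x))


def adapted_forward(W, x, adapter):
    """Frozen base projection W @ x plus the adapter's learned correction."""
    return [b + d for b, d in zip(matvec(W, x), adapter(x))]


if __name__ == "__main__":
    grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
    W = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for a frozen pretrained weight
    x = [0.25, -0.5]
    print(adapted_forward(W, x, LoRAAdapter(2, 1)))
    print(adapted_forward(W, x, KANAdapter(2, 1, grid)))
```

In this sketch the only structural difference between the two adapters is what sits on each connection: a scalar weight in `LoRAAdapter`, a small learnable curve in `KANLayer`. Inputs outside the grid contribute zero here, so a practical version would need to choose the grid range to cover the expected activations.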
Related papers
- Decoding Style: Efficient Fine-Tuning of LLMs for Image-Guided Outfit Recommendation with Preference [4.667044856219814]
This paper presents a novel framework that harnesses the expressive power of large language models (LLMs) for personalized outfit recommendations.
We bridge the visual-textual gap in item descriptions by employing image captioning with a Multimodal Large Language Model (MLLM).
The framework is evaluated on the Polyvore dataset, demonstrating its effectiveness in two key tasks: fill-in-the-blank, and complementary item retrieval.
arXiv Detail & Related papers (2024-09-18T17:15:06Z)
- ArtWeaver: Advanced Dynamic Style Integration via Diffusion Model [73.95608242322949]
Stylized Text-to-Image Generation (STIG) aims to generate images from text prompts and style reference images.
We present ArtWeaver, a novel framework that leverages pretrained Stable Diffusion to address challenges such as misinterpreted styles and inconsistent semantics.
arXiv Detail & Related papers (2024-05-24T07:19:40Z)
- FashionReGen: LLM-Empowered Fashion Report Generation [61.84580616045145]
We propose an intelligent Fashion Analyzing and Reporting system based on advanced Large Language Models (LLMs).
Specifically, it tries to deliver FashionReGen based on effective catwalk analysis, which is equipped with several key procedures.
It also inspires the exploration of higher-level tasks with industrial significance in other domains.
arXiv Detail & Related papers (2024-03-11T12:29:35Z)
- HiCAST: Highly Customized Arbitrary Style Transfer with Adapter Enhanced Diffusion Models [84.12784265734238]
The goal of Arbitrary Style Transfer (AST) is to inject the artistic features of a style reference into a given image/video.
We propose HiCAST, which is capable of explicitly customizing the stylization results according to various sources of semantic clues.
A novel learning objective is leveraged for video diffusion model training, which significantly improves cross-frame temporal consistency.
arXiv Detail & Related papers (2024-01-11T12:26:23Z)
- FIRST: A Million-Entry Dataset for Text-Driven Fashion Synthesis and Design [10.556799226837535]
We introduce a new dataset comprising a million high-resolution fashion images with rich structured textual (FIRST) descriptions.
Experiments on prevalent generative models trained over FIRST show the necessity of FIRST.
We invite the community to further develop more intelligent fashion synthesis and design systems.
arXiv Detail & Related papers (2023-11-13T15:50:25Z)
- Lost Your Style? Navigating with Semantic-Level Approach for Text-to-Outfit Retrieval [2.07180164747172]
We introduce a groundbreaking approach to fashion recommendations: a text-to-outfit retrieval task that generates a complete outfit set based solely on textual descriptions.
Our model is devised at three semantic levels (item, style, and outfit), where each level progressively aggregates data to form a coherent outfit recommendation.
Using the Maryland Polyvore and Polyvore Outfit datasets, our approach significantly outperformed state-of-the-art models in text-video retrieval tasks.
arXiv Detail & Related papers (2023-11-03T07:23:21Z)
- FashionSAP: Symbols and Attributes Prompt for Fine-grained Fashion Vision-Language Pre-training [12.652002299515864]
We propose a method for fine-grained fashion vision-language pre-training based on fashion Symbols and Attributes Prompt (FashionSAP).
Firstly, we propose the fashion symbols, a novel abstract fashion concept layer, to represent different fashion items.
Secondly, the attributes prompt method is proposed to make the model learn specific attributes of fashion items explicitly.
arXiv Detail & Related papers (2023-04-11T08:20:17Z)
- Fashionformer: A simple, Effective and Unified Baseline for Human Fashion Segmentation and Recognition [80.74495836502919]
In this work, we focus on joint human fashion segmentation and attribute recognition.
We introduce the object query for segmentation and the attribute query for attribute prediction.
For the attribute stream, we design a novel Multi-Layer Rendering module to explore more fine-grained features.
arXiv Detail & Related papers (2022-04-10T11:11:10Z)
- DRAN: Detailed Region-Adaptive Normalization for Conditional Image Synthesis [25.936764522125703]
We propose a novel normalization module, named Detailed Region-Adaptive Normalization (DRAN).
It adaptively learns both fine-grained and coarse-grained style representations.
We collect a new makeup dataset (Makeup-Complex dataset) that contains a wide range of complex makeup styles.
arXiv Detail & Related papers (2021-09-29T16:19:37Z)
- Knowledge Enhanced Neural Fashion Trend Forecasting [81.2083786318119]
This work focuses on investigating fine-grained fashion element trends for specific user groups.
We first contribute a large-scale fashion trend dataset (FIT) collected from Instagram with extracted time series fashion element records and user information.
We propose a Knowledge Enhanced Recurrent Network model (KERN) which takes advantage of the capability of deep recurrent neural networks in modeling time-series data.
arXiv Detail & Related papers (2020-05-07T07:42:17Z)
- Learning Diverse Fashion Collocation by Neural Graph Filtering [78.9188246136867]
We propose a novel fashion collocation framework, Neural Graph Filtering, that models a flexible set of fashion items via a graph neural network.
By applying symmetric operations on the edge vectors, this framework allows varying numbers of inputs/outputs and is invariant to their ordering.
We evaluate the proposed approach on three popular benchmarks, the Polyvore dataset, the Polyvore-D dataset, and our reorganized Amazon Fashion dataset.
arXiv Detail & Related papers (2020-03-11T16:17:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.