GlamTry: Advancing Virtual Try-On for High-End Accessories
- URL: http://arxiv.org/abs/2409.14553v1
- Date: Sun, 22 Sep 2024 18:29:32 GMT
- Title: GlamTry: Advancing Virtual Try-On for High-End Accessories
- Authors: Ting-Yu Chang, Seretsi Khabane Lekena
- Abstract summary: Existing virtual try-on models focus primarily on clothing items, but there is a gap in the market for accessories.
This research explores the application of techniques from 2D virtual try-on models for clothing, such as VITON-HD, and integrates them with other computer vision models.
Results demonstrate improved location prediction compared to the original model for clothes, even with a small dataset.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The paper aims to address the lack of photorealistic virtual try-on models for accessories such as jewelry and watches, which are particularly relevant for online retail applications. While existing virtual try-on models focus primarily on clothing items, there is a gap in the market for accessories. This research explores the application of techniques from 2D virtual try-on models for clothing, such as VITON-HD, and integrates them with other computer vision models, notably MediaPipe Hand Landmarker. Drawing on existing literature, the study customizes and retrains a unique model using accessory-specific data and network architecture modifications to assess the feasibility of extending virtual try-on technology to accessories. Results demonstrate improved location prediction compared to the original model for clothes, even with a small dataset. This underscores the model's potential with larger datasets exceeding 10,000 images, paving the way for future research in virtual accessory try-on applications.
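The abstract says MediaPipe Hand Landmarker is used for location prediction of wrist-worn accessories, but does not detail how. A minimal geometric sketch of one plausible approach is below; the function name, the `scale` factor, and the palm-width heuristic are illustrative assumptions, not the paper's method. MediaPipe's hand model emits 21 normalized (x, y) landmarks per hand, with index 0 at the wrist and indices 5 and 17 at the index- and pinky-finger MCP joints.

```python
# Hypothetical sketch (not the paper's code): estimating a watch placement
# box from hand landmarks of the kind produced by MediaPipe Hand Landmarker.

def watch_anchor_box(landmarks, img_w, img_h, scale=1.6):
    """Return a pixel-space (x0, y0, x1, y1) box centred on the wrist.

    `landmarks` is a list of (x, y) pairs in normalized [0, 1] coordinates.
    The box size is derived from the palm width (distance between the two
    MCP joints) so it adapts to hand size and distance from the camera.
    """
    wx, wy = landmarks[0]     # wrist landmark (MediaPipe index 0)
    ix, iy = landmarks[5]     # index-finger MCP (index 5)
    px, py = landmarks[17]    # pinky-finger MCP (index 17)
    palm_w = ((ix - px) ** 2 + (iy - py) ** 2) ** 0.5
    half = 0.5 * scale * palm_w               # half box side, normalized
    x0 = max(0.0, wx - half) * img_w
    y0 = max(0.0, wy - half) * img_h
    x1 = min(1.0, wx + half) * img_w
    y1 = min(1.0, wy + half) * img_h
    return (x0, y0, x1, y1)

# Toy landmarks: wrist at the image centre, MCP joints 0.2 apart.
toy = [(0.5, 0.5)] + [(0.0, 0.0)] * 20
toy[5] = (0.6, 0.3)
toy[17] = (0.4, 0.3)
box = watch_anchor_box(toy, 640, 480)
```

A box like this could then serve as the spatial conditioning that a VITON-HD-style generator warps the accessory image into, in place of the clothing-agnostic segmentation used for garments.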
Related papers
- SKT: Integrating State-Aware Keypoint Trajectories with Vision-Language Models for Robotic Garment Manipulation [82.61572106180705]
This paper presents a unified approach using vision-language models (VLMs) to improve keypoint prediction across various garment categories.
We created a large-scale synthetic dataset using advanced simulation techniques, allowing scalable training without extensive real-world data.
Experimental results indicate that the VLM-based method significantly enhances keypoint detection accuracy and task success rates.
arXiv Detail & Related papers (2024-09-26T17:26:16Z)
- Self-Supervised Vision Transformer for Enhanced Virtual Clothes Try-On [21.422611451978863]
We introduce an innovative approach for virtual clothes try-on, utilizing a self-supervised Vision Transformer (ViT) and a diffusion model.
Our method emphasizes detail enhancement by contrasting local clothing image embeddings, generated by ViT, with their global counterparts.
The experimental results showcase substantial advancements in the realism and precision of details in virtual try-on experiences.
arXiv Detail & Related papers (2024-06-15T07:46:22Z)
- AnyFit: Controllable Virtual Try-on for Any Combination of Attire Across Any Scenario [50.62711489896909]
AnyFit surpasses all baselines on high-resolution benchmarks and real-world data by a large gap.
AnyFit's impressive performance on high-fidelity virtual try-on in any scenario from any image paves a new path for future research within the fashion community.
arXiv Detail & Related papers (2024-05-28T13:33:08Z) - FashionFail: Addressing Failure Cases in Fashion Object Detection and Segmentation [7.483981721542115]
FashionFail is a new dataset with e-commerce images for object detection and segmentation.
Our analysis reveals the shortcomings of leading models, such as Attribute-Mask R-CNN and Fashionformer.
We propose a baseline approach using naive data augmentation to mitigate common failure cases and improve model robustness.
arXiv Detail & Related papers (2024-04-12T16:28:30Z) - Mobile Fitting Room: On-device Virtual Try-on via Diffusion Models [19.10976982327356]
Mobile Fitting Room is the first on-device diffusion-based virtual try-on system.
A usage scenario highlights how our tool can provide a seamless, interactive virtual try-on experience for customers.
arXiv Detail & Related papers (2024-02-02T20:05:45Z) - DM-VTON: Distilled Mobile Real-time Virtual Try-On [16.35842298296878]
Distilled Mobile Real-time Virtual Try-On (DM-VTON) is a novel virtual try-on framework designed to achieve simplicity and efficiency.
We introduce an efficient Mobile Generative Module within the Student network, significantly reducing the runtime.
Experimental results show that the proposed method can achieve 40 frames per second on a single Nvidia Tesla T4 GPU.
arXiv Detail & Related papers (2023-08-26T07:46:27Z) - Objaverse: A Universe of Annotated 3D Objects [53.2537614157313]
We present Objaverse 1.0, a large dataset of objects with 800K+ (and growing) 3D models with descriptive tags, captions and animations.
We demonstrate the large potential of Objaverse 3D models via four applications: training generative 3D models, improving tail category segmentation on the LVIS benchmark, training open-vocabulary object-navigation models for Embodied AI, and creating a new benchmark for robustness analysis of vision models.
arXiv Detail & Related papers (2022-12-15T18:56:53Z) - Multiface: A Dataset for Neural Face Rendering [108.44505415073579]
In this work, we present Multiface, a new multi-view, high-resolution human face dataset.
We introduce Mugsy, a large scale multi-camera apparatus to capture high-resolution synchronized videos of a facial performance.
The goal of Multiface is to close the gap in accessibility to high quality data in the academic community and to enable research in VR telepresence.
arXiv Detail & Related papers (2022-07-22T17:55:39Z) - Cloth Interactive Transformer for Virtual Try-On [106.21605249649957]
We propose a novel two-stage cloth interactive transformer (CIT) method for the virtual try-on task.
In the first stage, we design a CIT matching block, aiming to precisely capture the long-range correlations between the cloth-agnostic person information and the in-shop cloth information.
In the second stage, we put forth a CIT reasoning block for establishing global mutual interactive dependencies among person representation, the warped clothing item, and the corresponding warped cloth mask.
arXiv Detail & Related papers (2021-04-12T14:45:32Z) - Deep Fashion3D: A Dataset and Benchmark for 3D Garment Reconstruction
from Single Images [50.34202789543989]
Deep Fashion3D is the largest collection to date of 3D garment models.
It provides rich annotations including 3D feature lines, 3D body pose and the corresponded multi-view real images.
A novel adaptable template is proposed to enable the learning of all types of clothing in a single network.
arXiv Detail & Related papers (2020-03-28T09:20:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.