NGL-Prompter: Training-Free Sewing Pattern Estimation from a Single Image
- URL: http://arxiv.org/abs/2602.20700v1
- Date: Tue, 24 Feb 2026 09:01:11 GMT
- Title: NGL-Prompter: Training-Free Sewing Pattern Estimation from a Single Image
- Authors: Anna Badalyan, Pratheba Selvaraju, Giorgio Becherini, Omid Taheri, Victoria Fernandez Abrevaya, Michael Black,
- Abstract summary: Estimating sewing patterns from images is a practical approach for creating high-quality 3D garments. We propose NGL (Natural Garment Language), a novel intermediate language that restructures GarmentCode into a representation more understandable to language models. We evaluate our method on Dress4D, CloSe, and a newly collected dataset of approximately 5,000 in-the-wild fashion images.
- Score: 4.620470560214746
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Estimating sewing patterns from images is a practical approach for creating high-quality 3D garments. Due to the lack of real-world pattern-image paired data, prior approaches fine-tune large vision-language models (VLMs) on synthetic garment datasets generated by randomly sampling from the parametric garment model GarmentCode. However, these methods often struggle to generalize to in-the-wild images, fail to capture real-world correlations between garment parts, and are typically restricted to single-layer outfits. In contrast, we observe that VLMs are effective at describing garments in natural language, yet perform poorly when asked to directly regress GarmentCode parameters from images. To bridge this gap, we propose NGL (Natural Garment Language), a novel intermediate language that restructures GarmentCode into a representation more understandable to language models. Leveraging this language, we introduce NGL-Prompter, a training-free pipeline that queries large VLMs to extract structured garment parameters, which are then deterministically mapped to valid GarmentCode. We evaluate our method on Dress4D, CloSe, and a newly collected dataset of approximately 5,000 in-the-wild fashion images. Our approach achieves state-of-the-art performance on standard geometry metrics and is strongly preferred over existing baselines in both human and GPT-based perceptual evaluations. Furthermore, NGL-Prompter can recover multi-layer outfits, whereas competing methods focus mostly on single-layer garments, highlighting its strong generalization to real-world images even with occluded parts. These results demonstrate that accurate sewing pattern reconstruction is possible without costly model training. Our code and data will be released for research use.
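The abstract describes the key mechanism: a VLM fills in a constrained natural-language garment description, which is then deterministically mapped to valid pattern parameters. A minimal sketch of that idea, with an entirely hypothetical NGL vocabulary and parameter names (the real NGL schema and GarmentCode fields are not given here), might look like:

```python
# Illustrative sketch of the NGL-Prompter idea. All attribute and value names
# below are hypothetical stand-ins, not the paper's actual NGL schema or
# GarmentCode fields.

# Each NGL attribute has a closed set of natural-language values, each tied
# to a fixed parameter setting. The closed vocabulary is what lets the
# mapping to pattern parameters be deterministic and always valid.
NGL_SCHEMA = {
    "sleeve_length": {"sleeveless": 0.0, "short": 0.35, "elbow": 0.55, "long": 1.0},
    "skirt_flare":   {"straight": 0.0, "a-line": 0.4, "flared": 0.8, "circle": 1.0},
    "neckline":      {"crew": "round", "v-neck": "vee", "square": "square"},
}

def ngl_to_params(ngl_answer: dict) -> dict:
    """Deterministically map validated NGL answers to pattern parameters.

    Unknown attributes or out-of-vocabulary values raise, so malformed VLM
    output can never produce an invalid pattern specification.
    """
    params = {}
    for attr, value in ngl_answer.items():
        vocab = NGL_SCHEMA.get(attr)
        if vocab is None:
            raise KeyError(f"unknown NGL attribute: {attr}")
        if value not in vocab:
            raise ValueError(f"{attr}={value!r} not in vocabulary {sorted(vocab)}")
        params[attr] = vocab[value]
    return params

# In the real pipeline, this dict would come from prompting a VLM on an image.
example_answer = {"sleeve_length": "short", "skirt_flare": "a-line", "neckline": "v-neck"}
print(ngl_to_params(example_answer))
```

The point of the intermediate representation is visible even in this toy version: the VLM only has to choose among natural-language options it understands well, while validity of the final parameters is enforced by the deterministic mapping rather than by the model.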
Related papers
- GarmentPile++: Affordance-Driven Cluttered Garments Retrieval with Vision-Language Reasoning [27.756766557197746]
Garment manipulation has attracted increasing attention due to its critical role in home-assistant robotics. We propose a novel garment retrieval pipeline that can not only follow language instructions to execute safe and clean retrieval but also guarantee that exactly one garment is retrieved per attempt. Our pipeline seamlessly integrates vision-language reasoning with visual affordance perception, fully leveraging the high-level reasoning and planning capabilities of VLMs.
arXiv Detail & Related papers (2026-03-04T15:13:40Z) - DressWild: Feed-Forward Pose-Agnostic Garment Sewing Pattern Generation from In-the-Wild Images [50.11081091174558]
This paper focuses on sewing pattern generation for garment modeling and fabrication applications. We propose DressWild, a novel feed-forward pipeline that reconstructs physics-consistent 2D sewing patterns and the corresponding 3D garments from a single in-the-wild image.
arXiv Detail & Related papers (2026-02-18T14:45:15Z) - ChatGarment: Garment Estimation, Generation and Editing via Large Language Models [79.46056192947924]
ChatGarment is a novel approach that leverages large vision-language models (VLMs) to automate the estimation, generation, and editing of 3D garments. It can estimate sewing patterns from in-the-wild images or sketches, generate them from text descriptions, and edit garments based on user instructions.
arXiv Detail & Related papers (2024-12-23T18:59:28Z) - GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details [21.959372614365908]
GarVerseLOD aims to achieve unprecedented robustness in high-fidelity 3D garment reconstruction from a single unconstrained image.
GarVerseLOD collects 6,000 high-quality cloth models with fine-grained geometry details manually created by professional artists.
We propose a novel labeling paradigm based on conditional diffusion models to generate extensive paired images for each garment model with high photorealism.
arXiv Detail & Related papers (2024-11-05T12:30:07Z) - SPnet: Estimating Garment Sewing Patterns from a Single Image [10.604555099281173]
This paper presents a novel method for reconstructing 3D garment models from a single image of a posed user.
By inferring the fundamental shape of the garment through sewing patterns from a single image, we can generate 3D garments that can adaptively deform to arbitrary poses.
arXiv Detail & Related papers (2023-12-26T09:51:25Z) - StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On [35.227896906556026]
Given a clothing image and a person image, image-based virtual try-on aims to generate a customized image that appears natural and accurately reflects the characteristics of the clothing.
In this work, we aim to expand the applicability of the pre-trained diffusion model so that it can be utilized independently for the virtual try-on task.
Our proposed zero cross-attention blocks not only preserve the clothing details by learning the semantic correspondence but also generate high-fidelity images by utilizing the inherent knowledge of the pre-trained model in the warping process.
arXiv Detail & Related papers (2023-12-04T08:27:59Z) - Towards Garment Sewing Pattern Reconstruction from a Single Image [76.97825595711444]
A garment sewing pattern represents the intrinsic rest shape of a garment and is the core of many applications such as fashion design, virtual try-on, and digital avatars.
We first synthesize a versatile dataset, named SewFactory, which consists of around 1M images and ground-truth sewing patterns.
We then propose a two-level Transformer network called Sewformer, which significantly improves the sewing pattern prediction performance.
arXiv Detail & Related papers (2023-11-07T18:59:51Z) - Style-Based Global Appearance Flow for Virtual Try-On [119.95115739956661]
A novel global appearance flow estimation model is proposed in this work.
Experiment results on a popular virtual try-on benchmark show that our method achieves new state-of-the-art performance.
arXiv Detail & Related papers (2022-04-03T10:58:04Z) - Arbitrary Virtual Try-On Network: Characteristics Preservation and Trade-off between Body and Clothing [85.74977256940855]
We propose an Arbitrary Virtual Try-On Network (AVTON) for all-type clothes.
AVTON can synthesize realistic try-on images by preserving and trading off characteristics of the target clothes and the reference person.
Our approach can achieve better performance compared with the state-of-the-art virtual try-on methods.
arXiv Detail & Related papers (2021-11-24T08:59:56Z) - BCNet: Learning Body and Cloth Shape from A Single Image [56.486796244320125]
We propose a layered garment representation on top of SMPL and a novel formulation that makes the garment skinning weights independent of the body mesh.
Compared with existing methods, our method can support more garment categories and recover more accurate geometry.
arXiv Detail & Related papers (2020-04-01T03:41:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.