Fugu-MT Paper Translation (Abstract): MOFA-VTON: More Fashion Possibilities with Fine-Grained Adaptations in Virtual Try-On

Paper overview: MOFA-VTON: More Fashion Possibilities with Fine-Grained Adaptations in Virtual Try-On

arxiv url: http://arxiv.org/abs/2606.11148v1
Date: Tue, 09 Jun 2026 17:34:25 GMT
Status: Translation complete
System update date: 2026-06-10 15:40:58.647233
Title: MOFA-VTON: More Fashion Possibilities with Fine-Grained Adaptations in Virtual Try-On
Title (reference translation): MOFA-VTON: More Fashion Possibilities Through Fine-Grained Adaptation in Virtual Try-On
Authors: Xiaoyu Han, Chenyang Wang, Jing Wang, Shunyuan Zheng, Quanling Meng, Shengping Zhang
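Abstract summary: This work proposes a virtual try-on method called MOFA-VTON. Specifically, it first designs a mask construction strategy that converts user-drawn curve sketches into a dual-region mask, replacing the conventional clothing-agnostic mask. It also proposes layout adjustment blocks that use the cross-attention mechanism to independently learn layout correspondences for the upper and lower regions of the human body.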
Reference score (independently computed attention score): 26.851916074509827
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Virtual try-on aims to fit an in-shop clothing image onto a specific human body. An optimal virtual try-on method should provide diverse and flexible dressing options, accurately reflecting the varied wearing styles encountered in real-life scenarios, tailored to individual preferences and fashion aspirations. However, current methods predominantly perform a direct replacement of the original clothing with the target clothing, following the same dressing pattern. This limited control over clothing adaptation may result in fixed and monotonous try-on outputs. To delve into More Fashion Possibilities with Fine-Grained Adaptations in Virtual Try-On, we propose a novel virtual try-on method, termed MOFA-VTON, which allows adjustment for clothing adaptations in try-on results through simple sketches by users. Specifically, we first design a mask construction strategy that transforms user-drawn curve sketches into a dual-region mask, replacing the traditional clothing-agnostic mask and providing fine-grained layout guidance for the subsequent generation process. Further, we propose layout adjustment blocks that utilize the cross-attention mechanism to independently learn layout correspondences for upper and lower regions of the human body, refining the spatial arrangement of the two regions. With these implementations, our method enables flexible and fine-grained adaptations of target clothing, overcoming the constraints of a fixed layout. Extensive experiments on VITON-HD and DressCode datasets demonstrate that our proposed MOFA-VTON outperforms previous state-of-the-art methods and provides more fashion possibilities for virtual try-on.
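Abstract (reference translation): Virtual try-on aims to fit an image of in-shop clothing onto a specific human body. An ideal virtual try-on method should offer diverse, flexible dressing options that accurately reflect the varied wearing styles seen in real life and adapt to individual preferences and fashion aspirations. Current methods, however, mostly replace the original clothing with the target clothing directly, following a single dressing pattern. This limited control over clothing adaptation can lead to fixed, monotonous try-on outputs. To explore more fashion possibilities with fine-grained adaptations in virtual try-on, the authors propose a new virtual try-on method, MOFA-VTON, that lets users adjust clothing adaptations in the try-on result through simple sketches. Specifically, they first design a mask construction strategy that converts user-drawn curve sketches into a dual-region mask, replacing the conventional clothing-agnostic mask and providing fine-grained layout guidance for the subsequent generation process. They further propose layout adjustment blocks that use the cross-attention mechanism to independently learn layout correspondences for the upper and lower regions of the human body, refining the spatial arrangement of the two regions. With these components, the method enables flexible, fine-grained adaptation of the target clothing and overcomes the constraints of a fixed layout. Extensive experiments on the VITON-HD and DressCode datasets show that the proposed MOFA-VTON outperforms previous state-of-the-art methods and provides more fashion possibilities for virtual try-on.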
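
The abstract describes turning a user-drawn curve into a dual-region mask but gives no implementation details. The following is only an illustrative sketch of the general idea, not the authors' method: it assumes the sketch has already been rasterized to one boundary row per image column, and the function name `dual_region_mask` and its parameters are hypothetical.

```python
import numpy as np

def dual_region_mask(agnostic_mask, curve_y):
    """Split a binary mask into upper and lower regions along a curve.

    agnostic_mask: (H, W) bool array, True where clothing may be generated.
    curve_y: length-W integer array; curve_y[x] is the row index of the
             (assumed, pre-rasterized) user sketch at column x.
    Returns (upper, lower), two disjoint bool masks covering agnostic_mask.
    """
    h, w = agnostic_mask.shape
    rows = np.arange(h)[:, None]             # shape (H, 1)
    boundary = np.asarray(curve_y)[None, :]  # shape (1, W)
    above = rows < boundary                  # True for pixels above the curve
    upper = agnostic_mask & above
    lower = agnostic_mask & ~above           # pixels on or below the curve
    return upper, lower

# Toy example: a 6x4 image whose rows 1..4 form the editable region,
# with a curve that drops from row 2 to row 3 halfway across.
mask = np.zeros((6, 4), dtype=bool)
mask[1:5, :] = True
upper, lower = dual_region_mask(mask, np.array([2, 2, 3, 3]))
```

In this toy setup the two masks are disjoint and together reproduce the input mask, so moving the curve up or down shifts area between the upper and lower regions without changing the total editable area.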


Related papers