Fugu-MT paper translation (abstract): DiffLocks: Generating 3D Hair from a Single Image using Diffusion Models

Paper overview: DiffLocks: Generating 3D Hair from a Single Image using Diffusion Models

arxiv url: http://arxiv.org/abs/2505.06166v1
Date: Fri, 09 May 2025 16:16:42 GMT
Status: translation complete
Last updated in system: 2025-05-12 20:40:10.335083
Title: DiffLocks: Generating 3D Hair from a Single Image using Diffusion Models
Title (reference translation): DiffLocks: Generating 3D hair from a single image using diffusion models
Authors: Radu Alexandru Rosu, Keyu Wu, Yao Feng, Youyi Zheng, Michael J. Black
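Abstract summary: DiffLocks is a novel framework that enables reconstruction of a wide variety of hairstyles directly from a single image. First, it addresses the lack of 3D hair data by automating the creation of the largest synthetic hair dataset to date, containing 40K hairstyles. By using a pretrained image backbone, the method generalizes to in-the-wild images despite being trained only on synthetic data.
Reference score (independently computed attention score): 53.08138861924767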
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We address the task of generating 3D hair geometry from a single image, which is challenging due to the diversity of hairstyles and the lack of paired image-to-3D hair data. Previous methods are primarily trained on synthetic data and cope with the limited amount of such data by using low-dimensional intermediate representations, such as guide strands and scalp-level embeddings, that require post-processing to decode, upsample, and add realism. These approaches fail to reconstruct detailed hair, struggle with curly hair, or are limited to handling only a few hairstyles. To overcome these limitations, we propose DiffLocks, a novel framework that enables detailed reconstruction of a wide variety of hairstyles directly from a single image. First, we address the lack of 3D hair data by automating the creation of the largest synthetic hair dataset to date, containing 40K hairstyles. Second, we leverage the synthetic hair dataset to learn an image-conditioned diffusion-transformer model that generates accurate 3D strands from a single frontal image. By using a pretrained image backbone, our method generalizes to in-the-wild images despite being trained only on synthetic data. Our diffusion model predicts a scalp texture map in which any point in the map contains the latent code for an individual hair strand. These codes are directly decoded to 3D strands without post-processing techniques. Representing individual strands, instead of guide strands, enables the transformer to model the detailed spatial structure of complex hairstyles. With this, DiffLocks can recover highly curled hair, like afro hairstyles, from a single image for the first time. Data and code are available at https://radualexandru.github.io/difflocks/
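Abstract (reference translation): We address the task of generating 3D hair geometry from a single image, which is difficult because hairstyles are diverse and paired image-to-3D hair data is lacking. Previous methods are trained mainly on synthetic data and cope with its limited amount by using low-dimensional intermediate representations, such as guide strands and scalp-level embeddings, which require post-processing to decode, upsample, and add realism. These approaches fail to reconstruct detailed hair, struggle with curly hair, or handle only a few hairstyles. To overcome these limitations, we propose DiffLocks, a novel framework that enables detailed reconstruction of a wide variety of hairstyles directly from a single image. First, we address the lack of 3D hair data by automating the creation of the largest synthetic hair dataset to date, containing 40K hairstyles. Second, we use the synthetic hair dataset to learn an image-conditioned diffusion-transformer model that generates accurate 3D strands from a single frontal image. By using a pretrained image backbone, our method generalizes to in-the-wild images despite being trained only on synthetic data. Our diffusion model predicts a scalp texture map in which each point contains the latent code for an individual hair strand. These codes are decoded directly to 3D strands without post-processing techniques. Representing individual strands, rather than guide strands, enables the transformer to model the detailed spatial structure of complex hairstyles. With this, DiffLocks can recover highly curled hair, such as afro hairstyles, from a single image for the first time. Data and code are available at https://radualexandru.github.io/difflocks/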
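
The abstract describes a scalp texture map in which each point holds the latent code of one hair strand, decoded directly to a 3D strand. The following Python sketch illustrates only that data flow and is not the authors' implementation. The function names (`bilinear_sample`, `toy_decode`, `strands_from_texture`), the three-value latent code, and the straight-line decoder are all hypothetical stand-ins: in the paper the texture is predicted by a diffusion model and the decoder is learned.

```python
import math


def bilinear_sample(texture, u, v):
    """Sample a latent code at continuous scalp UV coordinates (u, v in [0, 1]).

    `texture` is an H x W grid whose cells are latent codes (lists of floats).
    """
    h, w = len(texture), len(texture[0])
    x, y = u * (w - 1), v * (h - 1)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    dim = len(texture[0][0])
    return [
        (1 - fy) * ((1 - fx) * texture[y0][x0][c] + fx * texture[y0][x1][c])
        + fy * ((1 - fx) * texture[y1][x0][c] + fx * texture[y1][x1][c])
        for c in range(dim)
    ]


def toy_decode(code, root, num_points=4):
    """Hypothetical placeholder decoder (assumption, not the learned one).

    Treats the first three latent values as a growth vector and returns a
    straight polyline from `root` to `root + code`.
    """
    return [
        tuple(root[i] + (k / (num_points - 1)) * code[i] for i in range(3))
        for k in range(num_points)
    ]


def strands_from_texture(texture, uv_roots, root_positions, num_points=4):
    """Decode one strand per scalp root: sample its latent code, then decode it."""
    return [
        toy_decode(bilinear_sample(texture, u, v), root, num_points)
        for (u, v), root in zip(uv_roots, root_positions)
    ]


if __name__ == "__main__":
    # A 2 x 2 toy texture with 3-value latent codes (made-up numbers).
    texture = [
        [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
        [[0.0, 1.0, 0.0], [1.0, 1.0, 0.0]],
    ]
    strands = strands_from_texture(texture, [(0.5, 0.5)], [(0.0, 0.0, 0.0)], 3)
    for point in strands[0]:
        print(point)
```

Because every strand has its own code in the map, the number of decoded strands is set simply by how many root positions are sampled, which is consistent with the abstract's point that no guide-strand upsampling step is needed.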

Related papers