Fugu-MT Paper Translation (Abstract): TAPESTRY: From Geometry to Appearance via Consistent Turntable Videos

Paper overview: TAPESTRY: From Geometry to Appearance via Consistent Turntable Videos

arxiv url: http://arxiv.org/abs/2603.17735v1
Date: Wed, 18 Mar 2026 14:02:09 GMT
Status: Translation complete
Last updated in system: 2026-03-19 18:32:57.73336
Title: TAPESTRY: From Geometry to Appearance via Consistent Turntable Videos
Title (reference translation): TAPESTRY: From Geometry to Appearance through Consistent Turntable Videos
Authors: Yan Zeng, Haoran Jiang, Kaixin Yao, Qixuan Zhang, Longwen Zhang, Lan Xu, Jingyi Yu
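Abstract summary: We introduce TAPESTRY, a framework for generating high-fidelity TTVs conditioned on explicit 3D geometry. We also design a method for downstream reconstruction tasks from the TTV input, featuring a multi-stage pipeline with 3D-Aware Inpainting. The results show that our method outperforms existing approaches in both video consistency and final reconstruction quality.
Reference score (independently computed attention score): 65.99602532894241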
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Automatically generating photorealistic and self-consistent appearances for untextured 3D models is a critical challenge in digital content creation. The advancement of large-scale video generation models offers a natural approach: directly synthesizing 360-degree turntable videos (TTVs), which can serve not only as high-quality dynamic previews but also as an intermediate representation to drive texture synthesis and neural rendering. However, existing general-purpose video diffusion models struggle to maintain strict geometric consistency and appearance stability across the full range of views, making their outputs ill-suited for high-quality 3D reconstruction. To this end, we introduce TAPESTRY, a framework for generating high-fidelity TTVs conditioned on explicit 3D geometry. We reframe the 3D appearance generation task as a geometry-conditioned video diffusion problem: given a 3D mesh, we first render and encode multi-modal geometric features to constrain the video generation process with pixel-level precision, thereby enabling the creation of high-quality and consistent TTVs. Building upon this, we also design a method for downstream reconstruction tasks from the TTV input, featuring a multi-stage pipeline with 3D-Aware Inpainting. By rotating the model and performing a context-aware secondary generation, this pipeline effectively completes self-occluded regions to achieve full surface coverage. The videos generated by TAPESTRY are not only high-quality dynamic previews but also serve as a reliable, 3D-aware intermediate representation that can be seamlessly back-projected into UV textures or used to supervise neural rendering methods like 3DGS. This enables the automated creation of production-ready, complete 3D assets from untextured meshes. Experimental results demonstrate that our method outperforms existing approaches in both video consistency and final reconstruction quality.
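Abstract (reference translation): Automatically generating photorealistic, self-consistent appearances for untextured 3D models is an important challenge in digital content creation. Advances in large-scale video generation models offer a natural approach: directly synthesizing 360-degree turntable videos (TTVs), which serve not only as high-quality dynamic previews but also as an intermediate representation that drives texture synthesis and neural rendering. However, existing general-purpose video diffusion models struggle to maintain strict geometric consistency and appearance stability across the full range of views, which makes them unsuitable for high-quality 3D reconstruction. To this end, we introduce TAPESTRY, a framework for generating high-fidelity TTVs conditioned on explicit 3D geometry. Given a 3D mesh, we first render and encode multi-modal geometric features, constraining the video generation process with pixel-level precision and enabling the creation of high-quality, consistent TTVs. We also design a method for downstream reconstruction tasks from the TTV input, featuring a multi-stage pipeline with 3D-Aware Inpainting. By rotating the model and performing a context-aware secondary generation, this pipeline effectively completes self-occluded regions and achieves full surface coverage. The videos generated by TAPESTRY are not only high-quality dynamic previews but also serve as a reliable, 3D-aware intermediate representation that can be seamlessly back-projected into UV textures or used to supervise neural rendering methods such as 3DGS. This enables the automated creation of production-ready, complete 3D assets from untextured meshes. Experimental results show that our method outperforms existing approaches in both video consistency and final reconstruction quality.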
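
The abstract describes 360-degree turntable videos rendered around a mesh but does not give the camera setup. As a rough illustration only, the sketch below computes evenly spaced orbit cameras and a look-at basis for each frame. The function names, the frame count, the radius, the elevation angle, and the Y-up, right-handed convention are all assumptions made for this example and are not taken from the paper.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    n = math.sqrt(dot(v, v))
    return (v[0] / n, v[1] / n, v[2] / n)


def turntable_cameras(num_frames, radius=2.0, elevation_deg=15.0):
    """Return (azimuth_deg, position) pairs on a circular orbit around the origin.

    Hypothetical helper: the orbit radius and elevation are illustrative values.
    """
    elev = math.radians(elevation_deg)
    cams = []
    for i in range(num_frames):
        az = 2.0 * math.pi * i / num_frames  # evenly spaced over 360 degrees
        pos = (radius * math.cos(elev) * math.cos(az),
               radius * math.sin(elev),
               radius * math.cos(elev) * math.sin(az))
        cams.append((math.degrees(az), pos))
    return cams


def look_at_rotation(position, target=(0.0, 0.0, 0.0), up=(0.0, 1.0, 0.0)):
    """Return the camera basis (right, up, forward) in world coordinates.

    Fails with a division by zero if the view direction is parallel to `up`
    (for example, an elevation of exactly 90 degrees).
    """
    forward = normalize(sub(target, position))
    right = normalize(cross(forward, up))
    true_up = cross(right, forward)
    return right, true_up, forward


if __name__ == "__main__":
    for az, pos in turntable_cameras(8):
        right, up, forward = look_at_rotation(pos)
        print(f"azimuth={az:6.1f}  position={pos}  forward={forward}")
```

Each pose could then be handed to any renderer to produce per-frame geometry buffers; which buffers TAPESTRY actually renders and encodes is not specified beyond "multi-modal geometric features".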

Related papers