Fugu-MT paper translation (abstract): ScribbleSense: Generative Scribble-Based Texture Editing with Intent Prediction

Paper overview: ScribbleSense: Generative Scribble-Based Texture Editing with Intent Prediction

arxiv url: http://arxiv.org/abs/2601.22455v1
Date: Fri, 30 Jan 2026 01:55:44 GMT
Status: Translation complete
Last updated in system: 2026-02-02 18:28:15.158825
Title: ScribbleSense: Generative Scribble-Based Texture Editing with Intent Prediction
Authors: Yudi Zhang, Yeming Geng, Lei Zhang
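Abstract summary: ScribbleSense is an editing method that combines multimodal large language models (MLLMs) and image generation models. It leverages the visual capabilities of MLLMs to predict the editing intent behind scribbles. Globally generated images are used to extract local texture details.
Reference score (attention score computed independently by this site): 5.109590115201006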
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Interactive 3D model texture editing presents enhanced opportunities for creating 3D assets, with freehand drawing style offering the most intuitive experience. However, existing methods primarily support sketch-based interactions for outlining, while the utilization of coarse-grained scribble-based interaction remains limited. Furthermore, current methodologies often encounter challenges due to the abstract nature of scribble instructions, which can result in ambiguous editing intentions and unclear target semantic locations. To address these issues, we propose ScribbleSense, an editing method that combines multimodal large language models (MLLMs) and image generation models to effectively resolve these challenges. We leverage the visual capabilities of MLLMs to predict the editing intent behind the scribbles. Once the semantic intent of the scribble is discerned, we employ globally generated images to extract local texture details, thereby anchoring local semantics and alleviating ambiguities concerning the target semantic locations. Experimental results indicate that our method effectively leverages the strengths of MLLMs, achieving state-of-the-art interactive editing performance for scribble-based texture editing.

Related papers