Fugu-MT Paper Translation (Summary): Omni-3DEdit: Generalized Versatile 3D Editing in One-Pass

Paper summary: Omni-3DEdit: Generalized Versatile 3D Editing in One-Pass

arxiv url: http://arxiv.org/abs/2603.17841v1
Date: Wed, 18 Mar 2026 15:32:41 GMT
Status: Translation complete
Last updated in system: 2026-03-19 18:32:57.798832
Title: Omni-3DEdit: Generalized Versatile 3D Editing in One-Pass
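Title (reference translation): Omni-3DEdit: Versatile 3D Editing in a Single Pass
Authors: Chen Liyi, Wang Pengfei, Zhang Guowen, Ma Zhiyuan, Zhang Lei
Abstract summary: We introduce Omni-3DEdit, a learning-based model that implicitly generalizes across various 3D editing tasks. Because it is learning-based, the model completes various 3D editing tasks in a single forward pass, without time-consuming online optimization.
Reference score (independently computed attention score): 0.0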
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Most instruction-driven 3D editing methods rely on 2D models to guide the explicit and iterative optimization of 3D representations. This paradigm, however, suffers from two primary drawbacks. First, it lacks a universal design of different 3D editing tasks because the explicit manipulation of 3D geometry necessitates task-dependent rules, e.g., 3D appearance editing demands inherent source 3D geometry, while 3D removal alters source geometry. Second, the iterative optimization process is highly time-consuming, often requiring thousands of invocations of 2D/3D updating. We present Omni-3DEdit, a unified, learning-based model that generalizes various 3D editing tasks implicitly. One key challenge to achieve our goal is the scarcity of paired source-edited multi-view assets for training. To address this issue, we construct a data pipeline, synthesizing a relatively rich number of high-quality paired multi-view editing samples. Subsequently, we adapt the pre-trained generative model SEVA as our backbone by concatenating source view latents along with conditional tokens in sequence space. A dual-stream LoRA module is proposed to disentangle different view cues, largely enhancing our model's representational learning capability. As a learning-based model, our model is free of the time-consuming online optimization, and it can complete various 3D editing tasks in one forward pass, reducing the inference time from tens of minutes to approximately two minutes. Extensive experiments demonstrate the effectiveness and efficiency of Omni-3DEdit.
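Abstract (reference translation): Most instruction-driven 3D editing methods depend on 2D models to guide the explicit, iterative optimization of a 3D representation. This paradigm has two major drawbacks. First, it lacks a universal design across different 3D editing tasks, because explicitly manipulating 3D geometry requires task-dependent rules: 3D appearance editing needs the inherent source 3D geometry, whereas 3D removal changes the source geometry. Second, the iterative optimization is very slow and often requires thousands of 2D/3D update calls. We propose Omni-3DEdit, a unified, learning-based model that implicitly generalizes across various 3D editing tasks. A key challenge in reaching this goal is the scarcity of paired source and edited multi-view assets for training. To address it, we build a data pipeline that synthesizes a relatively large number of high-quality paired multi-view editing samples. We then adapt the pre-trained generative model SEVA as our backbone by concatenating source-view latents with the conditional tokens in sequence space. A dual-stream LoRA module is proposed to disentangle the different view cues, which greatly improves the model's representation learning capability. Because the model is learning-based, it needs no time-consuming online optimization and can complete each 3D editing task in a single forward pass, cutting inference time from tens of minutes to about two minutes. Extensive experiments demonstrate the effectiveness and efficiency of Omni-3DEdit.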
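
The abstract does not say how the sequence-space concatenation or the dual-stream LoRA module is wired, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes that source-view and target-view tokens are joined along the sequence axis and that a shared frozen linear layer receives a separate low-rank adapter per stream. The names `DualStreamLoRALinear`, `concat_in_sequence`, and the stream labels `"source"` and `"target"` are hypothetical.

```python
import random

def matvec(m, v):
    """Multiply matrix m (rows x cols) by vector v (length cols)."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix, used for the frozen weight and the LoRA down-projections."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def concat_in_sequence(source_latents, target_latents):
    """Join the two token lists along the sequence axis and tag each token's stream."""
    tokens = source_latents + target_latents
    streams = ["source"] * len(source_latents) + ["target"] * len(target_latents)
    return tokens, streams

class DualStreamLoRALinear:
    """Frozen linear layer plus one low-rank adapter per token stream (assumed design)."""

    def __init__(self, dim_in, dim_out, rank, rng):
        self.weight = rand_matrix(dim_out, dim_in, rng)  # frozen base weight
        names = ("source", "target")
        # Usual LoRA init: random down-projection, zero up-projection,
        # so each adapter starts as a no-op on top of the base layer.
        self.down = {s: rand_matrix(rank, dim_in, rng) for s in names}
        self.up = {s: [[0.0] * rank for _ in range(dim_out)] for s in names}

    def forward(self, tokens, streams):
        out = []
        for tok, stream in zip(tokens, streams):
            base = matvec(self.weight, tok)
            # Route the token through the adapter that belongs to its stream.
            delta = matvec(self.up[stream], matvec(self.down[stream], tok))
            out.append([b + d for b, d in zip(base, delta)])
        return out

# Toy usage: 3 source-view tokens and 2 target-view tokens of width 4.
rng = random.Random(0)
layer = DualStreamLoRALinear(dim_in=4, dim_out=4, rank=2, rng=rng)
source = [[rng.random() for _ in range(4)] for _ in range(3)]
target = [[rng.random() for _ in range(4)] for _ in range(2)]
tokens, streams = concat_in_sequence(source, target)
out = layer.forward(tokens, streams)
```

In this sketch, updating only the `"source"` adapter changes the outputs for source-view tokens and leaves target-view tokens untouched, which is one simple way the two kinds of view cues could be kept apart.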

Related papers