Fugu-MT Paper Translation (Abstract): SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments

Paper overview: SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments

arxiv url: http://arxiv.org/abs/2604.14144v1
Date: Wed, 15 Apr 2026 17:59:12 GMT
Status: Translation complete
Last updated in system: 2026-04-16 20:38:32.672918
Title: SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments
Title (reference translation): SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments
Authors: Dinging Li, Yingxiu Zhao, Xinrui Cheng, Kangheng Lin, Hongbo Peng, Hongxing Li, Zixuan Wang, Yuhong Dai, Haodong Li, Jia Wang, Yukang Shi, Liang Zhao, Jianjian Sun, Zheng Ge, Xiangyu Zhang, Weiming Lu, Jun Xiao, Yueting Zhuang, Yongliang Shen
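Abstract summary: SpatialEvo is a self-evolving framework for 3D spatial reasoning. It formalizes 16 spatial reasoning task categories under explicit geometric validation rules, converts unannotated 3D scenes into zero-noise interactive oracles, and replaces model consensus with objective physical feedback.
Reference score (independently computed attention score): 75.60795462502949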
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Spatial reasoning over three-dimensional scenes is a core capability for embodied intelligence, yet continuous model improvement remains bottlenecked by the cost of geometric annotation. The self-evolving paradigm offers a promising path, but its reliance on model consensus to construct pseudo-labels causes training to reinforce rather than correct the model's own geometric errors. We identify a property unique to 3D spatial reasoning that circumvents this limitation: ground truth is a deterministic consequence of the underlying geometry, computable exactly from point clouds and camera poses without any model involvement. Building on this insight, we present SpatialEvo, a self-evolving framework for 3D spatial reasoning, centered on the Deterministic Geometric Environment (DGE). The DGE formalizes 16 spatial reasoning task categories under explicit geometric validation rules and converts unannotated 3D scenes into zero-noise interactive oracles, replacing model consensus with objective physical feedback. A single shared-parameter policy co-evolves across questioner and solver roles under DGE constraints: the questioner generates physically valid spatial questions grounded in scene observations, while the solver derives precise answers against DGE-verified ground truth. A task-adaptive scheduler endogenously concentrates training on the model's weakest categories, producing a dynamic curriculum without manual design. Experiments across nine benchmarks demonstrate that SpatialEvo achieves the highest average score at both 3B and 7B scales, with consistent gains on spatial reasoning benchmarks and no degradation on general visual understanding.
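The abstract's point that ground truth follows deterministically from geometry can be illustrated with a small sketch. The Python below is not the paper's DGE. It assumes each object is given as a list of world-frame (x, y, z) points and that the camera pose is a world-to-camera rotation matrix plus a translation vector. The function names (`object_distance`, `left_or_right`, `is_correct`), the centroid-based definitions, the convention that the camera x-axis points right, and the 5% tolerance are all hypothetical choices made for illustration.

```python
import math

def centroid(points):
    """Mean of a non-empty list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def object_distance(points_a, points_b):
    """Euclidean distance between the centroids of two point clouds."""
    return math.dist(centroid(points_a), centroid(points_b))

def world_to_camera(point, rotation, translation):
    """Rigid transform p_cam = R * p_world + t, with R given as 3 rows."""
    return tuple(
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )

def left_or_right(points_a, points_b, rotation, translation):
    """Is object A left or right of object B as seen from the camera?

    Assumption: the camera x-axis points to the right of the image.
    """
    xa = world_to_camera(centroid(points_a), rotation, translation)[0]
    xb = world_to_camera(centroid(points_b), rotation, translation)[0]
    return "left" if xa < xb else "right"

def is_correct(answer, ground_truth, rel_tol=0.05):
    """Check a solver answer against geometry-derived ground truth.

    Numeric answers pass within a relative tolerance (an assumed rule);
    categorical answers must match exactly.
    """
    if isinstance(ground_truth, (int, float)):
        return math.isclose(answer, ground_truth, rel_tol=rel_tol)
    return answer == ground_truth

# Toy scene: two objects, camera at the world origin with identity pose.
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ZERO = [0, 0, 0]
chair = [(-0.5, 0, 2), (0.5, 0, 2)]   # centroid (0, 0, 2)
table = [(2.5, 0, 6), (3.5, 0, 6)]    # centroid (3, 0, 6)

print(object_distance(chair, table))                 # 5.0
print(left_or_right(chair, table, IDENTITY, ZERO))   # left
```

No model output enters either computation, so the same scene always yields the same label. That determinism is the property the abstract relies on to replace consensus-based pseudo-labels.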
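Abstract (reference translation): Spatial reasoning over three-dimensional scenes is a core capability for embodied intelligence, but continuous model improvement is bottlenecked by the cost of geometric annotation. The self-evolving paradigm offers a promising path, but because it relies on model consensus to build pseudo-labels, training reinforces the model's own geometric errors instead of correcting them. In 3D spatial reasoning, ground truth is a deterministic consequence of the underlying geometry and can be computed exactly from point clouds and camera poses with no model involved. Building on this insight, we present SpatialEvo, a self-evolving framework for 3D spatial reasoning centered on the Deterministic Geometric Environment (DGE). The DGE formalizes 16 spatial reasoning task categories under explicit geometric validation rules and converts unannotated 3D scenes into zero-noise interactive oracles, replacing model consensus with objective physical feedback. A single shared-parameter policy co-evolves across the questioner and solver roles under DGE constraints: the questioner generates physically valid spatial questions grounded in scene observations, while the solver derives precise answers against DGE-verified ground truth. A task-adaptive scheduler endogenously concentrates training on the model's weakest categories, producing a dynamic curriculum without manual design. In experiments across nine benchmarks, SpatialEvo achieves the highest average score at both the 3B and 7B scales, with consistent gains on spatial reasoning benchmarks and no degradation on general visual understanding.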
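The task-adaptive scheduler is described only at a high level: it concentrates training on the model's weakest categories. One way such a scheduler could work is sketched below. This is an assumed design, not the authors' method: sampling weights proportional to the per-category error rate, a small floor so that no category disappears, and an exponential moving average for the accuracy estimate. The category names, `floor`, and `momentum` are hypothetical.

```python
import random

def category_weights(accuracy, floor=0.05):
    """Map per-category accuracy to sampling probabilities.

    The raw weight is the error rate (1 - accuracy) plus a floor, so
    weak categories are sampled more often but none drops to zero.
    """
    raw = {name: (1.0 - acc) + floor for name, acc in accuracy.items()}
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}

def sample_category(accuracy, rng):
    """Draw the next training category according to current weights."""
    weights = category_weights(accuracy)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

def update_accuracy(accuracy, category, correct, momentum=0.9):
    """Update the running accuracy of one category in place (EMA)."""
    accuracy[category] = (
        momentum * accuracy[category] + (1.0 - momentum) * float(correct)
    )

# Hypothetical running accuracies for three illustrative categories.
accuracy = {
    "object_distance": 0.9,
    "relative_direction": 0.2,
    "object_count": 0.6,
}
rng = random.Random(0)
category = sample_category(accuracy, rng)
# Suppose the oracle marked the solver's answer as wrong.
update_accuracy(accuracy, category, correct=False)
```

Because the weights are recomputed from the running accuracies at every draw, the curriculum shifts on its own as categories improve, which matches the abstract's description of a dynamic curriculum without manual design.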

Related papers