Fugu-MT Paper Translation (Abstract): LEXIS: LatEnt ProXimal Interaction Signatures for 3D HOI from an Image

Paper overview: LEXIS: LatEnt ProXimal Interaction Signatures for 3D HOI from an Image

arxiv url: http://arxiv.org/abs/2604.20800v1
Date: Wed, 22 Apr 2026 17:27:13 GMT
Status: Translation complete
Last updated in system: 2026-04-23 15:36:11.258182
Title: LEXIS: LatEnt ProXimal Interaction Signatures for 3D HOI from an Image
Title (reference translation): LEXIS: Latent Proximal Interaction Signatures for 3D HOI from an Image
Authors: Dimitrije Antić, Alvaro Budria, George Paschalidis, Sai Kumar Dwivedi, Dimitrios Tzionas
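Abstract summary: Reconstructing 3D human-object interaction from an RGB image is essential for perceptive systems. Current methods rely on sparse, binary contact cues; the authors address this limitation via InterFields, a representation that encodes dense, continuous proximity. They develop LEXIS-Flow, a diffusion framework that leverages LEXIS signatures to estimate human and object meshes.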
Reference score (independently computed attention score): 11.119389060991532
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Reconstructing 3D Human-Object Interaction from an RGB image is essential for perceptive systems. Yet, this remains challenging as it requires capturing the subtle physical coupling between the body and objects. While current methods rely on sparse, binary contact cues, these fail to model the continuous proximity and dense spatial relationships that characterize natural interactions. We address this limitation via InterFields, a representation that encodes dense, continuous proximity across the entire body and object surfaces. However, inferring these fields from single images is inherently ill-posed. To tackle this, our intuition is that interaction patterns are characteristically structured by the action and object geometry. We capture this structure in LEXIS, a novel discrete manifold of interaction signatures learned via a VQ-VAE. We then develop LEXIS-Flow, a diffusion framework that leverages LEXIS signatures to estimate human and object meshes alongside their InterFields. Notably, these InterFields help in a guided refinement that ensures physically-plausible, proximity-aware reconstructions without requiring post-hoc optimization. Evaluation on Open3DHOI and BEHAVE shows that LEXIS-Flow significantly outperforms existing SotA baselines in reconstruction, contact, and proximity quality. Our approach not only improves generalization but also yields reconstructions perceived as more realistic, moving us closer to holistic 3D scene understanding. Code & models will be public at https://anticdimi.github.io/lexis.
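Abstract (reference translation): Reconstructing 3D human-object interaction from an RGB image is essential for perceptive systems. However, it remains difficult because it requires capturing the subtle physical coupling between the body and objects. Current methods rely on sparse, binary contact cues, which cannot model the continuous proximity and dense spatial relationships that characterize natural interactions. We address this limitation through InterFields, a representation that encodes dense, continuous proximity across the entire body and object surfaces. However, inferring these fields from a single image is inherently ill-posed. To resolve this, our intuition is that interaction patterns are characteristically structured by the action and the object geometry. We capture this structure in LEXIS, a discrete manifold of interaction signatures learned with a VQ-VAE. We then develop LEXIS-Flow, a diffusion framework that uses LEXIS signatures to estimate human and object meshes together with their InterFields. Notably, these InterFields support a guided refinement that yields physically plausible, proximity-aware reconstructions without requiring post-hoc optimization. In evaluations on Open3DHOI and BEHAVE, LEXIS-Flow significantly outperforms existing SotA baselines in reconstruction, contact, and proximity quality. Our approach not only improves generalization but also yields reconstructions perceived as more realistic, moving closer to holistic 3D scene understanding. Code and models will be made public at https://anticdimi.github.io/lexis.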
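
The contrast the abstract draws between binary contact cues and dense, continuous proximity can be illustrated with a small sketch. This is not the authors' InterFields definition: it assumes a simple nearest-point Euclidean distance between two point sets, and the names `proximity_field` and `binary_contact`, the toy coordinates, and the 0.02 threshold (2 cm if units are meters) are all hypothetical choices made for illustration.

```python
import numpy as np

def proximity_field(src_pts, dst_pts):
    """Distance from each source point to its nearest destination point.

    Assumption: plain nearest-point Euclidean distance, used here only to
    illustrate a dense, continuous proximity signal.
    """
    # Pairwise differences, shape (N_src, N_dst, 3)
    diff = src_pts[:, None, :] - dst_pts[None, :, :]
    dists = np.linalg.norm(diff, axis=-1)  # shape (N_src, N_dst)
    return dists.min(axis=1)               # shape (N_src,)

def binary_contact(field, threshold=0.02):
    """Collapse a continuous field into a binary contact label (hypothetical threshold)."""
    return field < threshold

# Toy "body" and "object" points (hypothetical data, not from the paper)
body = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 0.05], [0.0, 0.0, 0.5]])
obj = np.array([[0.0, 0.0, 0.01], [0.0, 0.0, 0.6]])

body_field = proximity_field(body, obj)  # one value per body point
obj_field = proximity_field(obj, body)   # one value per object point
contact = binary_contact(body_field)

print(body_field)
print(contact)
```

In this toy case the second body point lies 4 cm from the object: the binary label marks it simply as "no contact", while the continuous field retains how close it is. That lost information is what the abstract argues sparse contact cues fail to capture.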

Related papers