Fugu-MT paper translation (summary): ObjChangeVR: Object State Change Reasoning from Continuous Egocentric Views in VR Environments

Paper summary: ObjChangeVR: Object State Change Reasoning from Continuous Egocentric Views in VR Environments

arxiv url: http://arxiv.org/abs/2603.06648v1
Date: Fri, 27 Feb 2026 19:29:02 GMT
Status: Translation complete
System update date: 2026-03-15 16:38:22.478273
Title: ObjChangeVR: Object State Change Reasoning from Continuous Egocentric Views in VR Environments
Authors: Shiyi Ding, Shaoen Wu, Ying Chen
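Abstract summary: We introduce a benchmark for the question-answering task on object state changes in virtual reality (VR). We also propose ObjChangeVR, a framework that combines viewpoint-aware and temporal-based retrieval with cross-view reasoning.
Reference score (independently computed attention score): 4.46498673219845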
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advances in multimodal large language models (MLLMs) offer a promising approach for natural language-based scene change queries in virtual reality (VR). Prior work on applying MLLMs for object state understanding has focused on egocentric videos that capture the camera wearer's interactions with objects. However, object state changes may occur in the background without direct user interaction, lacking explicit motion cues and making them difficult to detect. Moreover, no benchmark exists for evaluating this challenging scenario. To address these challenges, we introduce ObjChangeVR-Dataset, specifically for benchmarking the question-answering task of object state change. We also propose ObjChangeVR, a framework that combines viewpoint-aware and temporal-based retrieval to identify relevant frames, along with cross-view reasoning that reconciles inconsistent evidence from multiple viewpoints. Extensive experiments demonstrate that ObjChangeVR significantly outperforms baseline approaches across multiple MLLMs.
