Fugu-MT paper translation (summary): Real Eyes Realize Faster: Gaze Stability and Pupil Novelty for Efficient Egocentric Learning

Paper summary: Real Eyes Realize Faster: Gaze Stability and Pupil Novelty for Efficient Egocentric Learning

arxiv url: http://arxiv.org/abs/2603.04098v1
Date: Wed, 04 Mar 2026 14:10:27 GMT
Status: translation complete
Last updated in system: 2026-03-05 21:29:15.339543
Title: Real Eyes Realize Faster: Gaze Stability and Pupil Novelty for Efficient Egocentric Learning
Authors: Ajan Subramanian, Sumukh Bettadapura, Rohan Sathish,
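Abstract summary: Always-on egocentric cameras are increasingly used as a source of demonstrations for robotics, imitation learning, and assistive AR. Under the storage and battery constraints of wearable devices, choosing which frames to keep is as important as how to learn from them. The authors propose a Dual-Criterion Frame Curator that first gates frames by gaze quality and then ranks the survivors by pupil-derived novelty.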
Reference score (site-computed attention score): 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Always-on egocentric cameras are increasingly used as demonstrations for embodied robotics, imitation learning, and assistive AR, but the resulting video streams are dominated by redundant and low-quality frames. Under the storage and battery constraints of wearable devices, choosing which frames to keep is as important as how to learn from them. We observe that modern eye-tracking headsets provide a continuous, training-free side channel that decomposes into two complementary axes: gaze fixation captures visual stability (quality), while pupil response captures arousal-linked moments (novelty). We operationalize this insight as a Dual-Criterion Frame Curator that first gates frames by gaze quality and then ranks the survivors by pupil-derived novelty. On the Visual Experience Dataset (VEDB), curated frames at 10% budget match the classification performance of the full stream, and naive signal fusion consistently destroys both contributions. The benefit is task-dependent: pupil ranking improves activity recognition, while gaze-only selection already dominates for scene recognition, confirming that the two signals serve genuinely different roles. Our method requires no model inference and operates at capture time, offering a path toward efficient, always-on egocentric data curation.
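The gate-then-rank procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how gaze quality or pupil novelty is computed, so the `Frame` fields, the `max_dispersion` threshold, the `baseline_window` size, and the novelty definition (absolute deviation of pupil diameter from a trailing mean) are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    index: int
    gaze_dispersion: float  # hypothetical stability measure: lower = steadier fixation
    pupil_diameter: float   # hypothetical units


def curate(frames, budget=0.10, max_dispersion=1.0, baseline_window=5):
    """Gate frames by gaze stability, then rank survivors by pupil novelty.

    Returns the indices of the kept frames in temporal order.
    All thresholds and the novelty definition are assumptions.
    """
    if not frames:
        return []
    keep = max(1, int(len(frames) * budget))

    # Assumed novelty score: absolute deviation of pupil diameter
    # from the mean over the preceding `baseline_window` frames.
    novelty = {}
    for i, frame in enumerate(frames):
        window = frames[max(0, i - baseline_window):i]
        if window:
            baseline = sum(w.pupil_diameter for w in window) / len(window)
        else:
            baseline = frame.pupil_diameter
        novelty[frame.index] = abs(frame.pupil_diameter - baseline)

    # Stage 1 (gate): drop frames whose gaze is too unstable.
    survivors = [f for f in frames if f.gaze_dispersion <= max_dispersion]

    # Stage 2 (rank): keep the most novel survivors, up to the budget.
    ranked = sorted(survivors, key=lambda f: novelty[f.index], reverse=True)
    return sorted(f.index for f in ranked[:keep])


# Toy stream: frame 7 has a pupil spike under stable gaze;
# frame 12 has a larger spike but unstable gaze, so the gate removes it.
frames = [Frame(i, 0.5, 3.0) for i in range(20)]
frames[7].pupil_diameter = 4.0
frames[12].pupil_diameter = 5.0
frames[12].gaze_dispersion = 2.0
selected = curate(frames, budget=0.10)
```

In this toy stream, frame 12 has the largest pupil deviation but is never ranked, because the gate runs first. Keeping the two signals sequential rather than mixing them into one score mirrors the abstract's remark that naive signal fusion destroys both contributions.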
