Fugu-MT Paper Translation (Abstract): Deployment-Oriented Session-wise Meta-Calibration for Landmark-Based Webcam Gaze Tracking

Paper overview: Deployment-Oriented Session-wise Meta-Calibration for Landmark-Based Webcam Gaze Tracking

arxiv url: http://arxiv.org/abs/2603.12388v1
Date: Thu, 12 Mar 2026 19:07:34 GMT
Status: Translation complete
Last updated in system: 2026-03-16 17:38:11.734254
Title: Deployment-Oriented Session-wise Meta-Calibration for Landmark-Based Webcam Gaze Tracking
Title (reference translation): Deployment-Oriented Session-wise Meta-Calibration for Landmark-Based Webcam Gaze Tracking
Authors: Chenkai Zhang,
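Abstract summary: Equivariant Meta-Calibrated Gaze (EMC-Gaze) is a lightweight landmark-only method that combines an E(3)-equivariant landmark-graph encoder, local eye geometry, binocular emphasis, auxiliary 3D gaze-direction supervision, and a closed-form ridge calibrator differentiated through episodic meta-training. On MPIIFaceGaze with short per-session calibration, the eye-focused model reaches 8.82 +/- 1.21 deg at 16-shot calibration, ties Elastic Net at 1-shot, and outperforms it from 3-shot onward.
Reference score (independently computed attention score): 7.900882226705444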
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Practical webcam gaze tracking is constrained not only by error, but also by calibration burden, robustness to head motion and session drift, runtime footprint, and browser use. We therefore target a deployment-oriented operating point rather than the image large-backbone regime. We cast landmark-based point-of-regard estimation as session-wise adaptation: a shared geometric encoder produces embeddings that can be aligned to a new session from a small calibration set. We present Equivariant Meta-Calibrated Gaze (EMC-Gaze), a lightweight landmark-only method combining an E(3)-equivariant landmark-graph encoder, local eye geometry, binocular emphasis, auxiliary 3D gaze-direction supervision, and a closed-form ridge calibrator differentiated through episodic meta-training. To reduce pose leakage, we use a two-view canonicalization consistency loss. The deployed predictor uses only facial landmarks and fits a per-session ridge head from brief calibration. In a fixation-style interactive evaluation over 33 sessions at 100 cm, EMC-Gaze achieves 5.79 +/- 1.81 deg RMSE after 9-point calibration versus 6.68 +/- 2.34 deg for Elastic Net; the gain is larger on still-head queries (2.92 +/- 0.75 deg vs. 4.45 +/- 0.30 deg). Across three subject holdouts of 10 subjects each, EMC-Gaze retains an advantage (5.66 +/- 0.19 deg vs. 6.49 +/- 0.33 deg). On MPIIFaceGaze with short per-session calibration, the eye-focused model reaches 8.82 +/- 1.21 deg at 16-shot calibration, ties Elastic Net at 1-shot, and outperforms it from 3-shot onward. The exported eye-focused encoder has 944,423 parameters, is 4.76 MB in ONNX, and supports calibrated browser prediction in 12.58/12.58/12.90 ms per sample (mean/median/p90) in Chromium 145 with ONNX Runtime Web. These results position EMC-Gaze as a calibration-friendly operating point rather than a universal state-of-the-art claim against heavier appearance-based systems.
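Abstract (reference translation): Practical webcam gaze tracking is limited not only by error but also by calibration burden, robustness to head motion and session drift, runtime footprint, and browser use. We therefore target a deployment-oriented operating point rather than the large-backbone image regime. Landmark-based point-of-regard estimation is treated as session-wise adaptation: a shared geometric encoder produces embeddings that can be aligned to a new session from a small calibration set. We propose Equivariant Meta-Calibrated Gaze (EMC-Gaze), a lightweight landmark-only method combining an E(3)-equivariant landmark-graph encoder, local eye geometry, binocular emphasis, auxiliary 3D gaze-direction supervision, and a closed-form ridge calibrator differentiated through episodic meta-training. To reduce pose leakage, a two-view canonicalization consistency loss is used. The deployed predictor uses only facial landmarks and fits a per-session ridge head from a brief calibration. In a fixation-style interactive evaluation over 33 sessions at 100 cm, EMC-Gaze achieves 5.79 +/- 1.81 deg RMSE after 9-point calibration, versus 6.68 +/- 2.34 deg for Elastic Net; the gain is larger on still-head queries (2.92 +/- 0.75 deg vs. 4.45 +/- 0.30 deg). Across three subject holdouts of 10 subjects each, EMC-Gaze retains its advantage (5.66 +/- 0.19 deg vs. 6.49 +/- 0.33 deg). On MPIIFaceGaze with short per-session calibration, the eye-focused model reaches 8.82 +/- 1.21 deg at 16-shot calibration, ties Elastic Net at 1-shot, and outperforms it from 3-shot onward. The exported eye-focused encoder has 944,423 parameters, is 4.76 MB in ONNX, and supports calibrated browser prediction in 12.58/12.58/12.90 ms per sample (mean/median/p90) in Chromium 145 with ONNX Runtime Web. These results position EMC-Gaze as a calibration-friendly operating point rather than a universal state-of-the-art claim against heavier appearance-based systems.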
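
The abstract describes fitting a per-session ridge head in closed form from a small calibration set of encoder embeddings. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes the standard ridge solution W = (Z^T Z + lam I)^(-1) Z^T Y on embeddings with an appended bias feature. The names fit_ridge_head, predict, solve, and the default lam are hypothetical, and for simplicity the bias is regularized along with the other weights.

```python
# Illustrative sketch only: function names and the lam default are assumptions,
# not taken from the paper. Pure standard-library Python, no dependencies.

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting.

    a is an n x n list of lists, b is n x m; returns x as n x m.
    """
    n = len(a)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        # Pick the row with the largest magnitude in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_ridge_head(embeddings, targets, lam=1e-2):
    """Return W minimizing ||Z W - Y||^2 + lam * ||W||^2.

    embeddings: calibration embeddings, one list of floats per sample.
    targets:    matching point-of-regard targets, e.g. [x, y] per sample.
    The last row of W is the bias (it is regularized too, for simplicity).
    """
    z = [list(e) + [1.0] for e in embeddings]  # append a constant bias feature
    d = len(z[0])
    k = len(targets[0])
    # Regularized Gram matrix: Z^T Z + lam * I
    gram = [[sum(row[i] * row[j] for row in z) + (lam if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    # Cross term: Z^T Y
    zty = [[sum(row[i] * y[c] for row, y in zip(z, targets))
            for c in range(k)] for i in range(d)]
    return solve(gram, zty)


def predict(w, embedding):
    """Apply the fitted head to one embedding."""
    z = list(embedding) + [1.0]
    return [sum(zi * w[i][c] for i, zi in enumerate(z))
            for c in range(len(w[0]))]
```

In use, one would compute embeddings for the calibration samples of a session (9 points in the interactive evaluation above), call fit_ridge_head once, and then call predict on each new embedding. Because the solution is closed-form, it is differentiable with respect to the embeddings, which is what allows a calibrator of this kind to be trained through in episodic meta-training.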


Related papers