Fugu-MT Paper Translation (Abstract): Semi-Supervised Gaze Estimation via Disentangled Subspace Contrastive Learning

Paper abstract: Semi-Supervised Gaze Estimation via Disentangled Subspace Contrastive Learning

arxiv url: http://arxiv.org/abs/2605.27080v1
Date: Tue, 26 May 2026 14:31:10 GMT
Status: Translation complete
Last updated in system: 2026-05-27 17:51:42.224966
Title: Semi-Supervised Gaze Estimation via Disentangled Subspace Contrastive Learning
Title (reference translation): Semi-Supervised Gaze Estimation via Disentangled Subspace Contrastive Learning
Authors: Qida Tan, Hongyu Yang, Wenchao Du
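Abstract summary: Appearance-based gaze estimation consistently suffers from poor generalization because annotated samples are limited and dataset diversity is insufficient. Leading approaches adopt weakly supervised learning to generate large-scale pseudo-labeled data from unconstrained real-world scenarios. We devise a simple yet effective semi-supervised learning architecture that leverages unlabeled data to enhance domain generalization.
Reference score (independently computed attention score): 20.422491630669885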
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Appearance-based gaze estimation always suffers from poor generalization due to limited annotated samples and insufficient dataset diversity. Leading approaches adopt weakly supervised learning to generate large-scale pseudo-labeled data from unconstrained real-world scenarios, aiming to mitigate the domain shifts. In this work, we devise a simple yet effective semi-supervised learning architecture that leverages unlabeled data to enhance domain generalization, thereby reducing reliance on labor-intensive manual annotations. Our key insight is to impose Jacobian regularization to disentangle feature representations into discriminative subspaces dedicated to specific gaze components, such as pitch and yaw angles. We further exploit the intrinsic ordinal ranking within each subspace for contrastive learning, enabling the model to learn robust gaze representations from a small set of labeled samples and an abundance of unlabeled ones. This ultimately yields our Disentangled Subspace Contrastive Learning (DSCL) framework. Extensive experiments on multiple benchmarks verify that the proposed DSCL is plug-and-play, achieving competitive performance using only 20%, 10%, and even 5% of the annotated data under both in-domain and cross-domain evaluation settings. The public code is available at https://github.com/da60266/DSCL.
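Abstract (reference translation): Appearance-based gaze estimation consistently suffers from poor generalization because annotated samples are limited and dataset diversity is insufficient. Leading approaches adopt weakly supervised learning to generate large-scale pseudo-labeled data from unconstrained real-world scenarios and thereby mitigate domain shift. In this work, we devise a simple yet effective semi-supervised learning architecture that leverages unlabeled data to strengthen domain generalization and reduce reliance on labor-intensive manual annotation. Our key insight is to use Jacobian regularization to disentangle feature representations into discriminative subspaces dedicated to specific gaze components, such as pitch and yaw angles. We further exploit the intrinsic ordinal ranking within each subspace for contrastive learning, which lets the model learn robust gaze representations from a small set of labeled samples and an abundance of unlabeled ones. This ultimately yields the Disentangled Subspace Contrastive Learning (DSCL) framework. Extensive experiments on multiple benchmarks verify that the proposed DSCL is plug-and-play, achieving competitive performance with only 20%, 10%, and even 5% of the annotated data under both in-domain and cross-domain evaluation settings. The code is publicly available at https://github.com/da60266/DSCL.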
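
The abstract does not spell out how the Jacobian regularizer is constructed, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes a hypothetical 4-dimensional feature vector whose first two dimensions are assigned to pitch and last two to yaw, and penalizes any sensitivity of a gaze output to features outside its assigned slice. The names `numerical_jacobian`, `off_subspace_penalty`, `SUBSPACES`, and both toy heads are invented for this example, and the finite-difference Jacobian stands in for whatever automatic differentiation a real model would use.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def numerical_jacobian(head: Callable[[Vector], Vector],
                       z: Sequence[float],
                       eps: float = 1e-5) -> List[Vector]:
    """Central-difference Jacobian: jac[i][j] = d head(z)[i] / d z[j]."""
    n_out = len(head(list(z)))
    jac = [[0.0] * len(z) for _ in range(n_out)]
    for j in range(len(z)):
        z_plus, z_minus = list(z), list(z)
        z_plus[j] += eps
        z_minus[j] -= eps
        out_plus, out_minus = head(z_plus), head(z_minus)
        for i in range(n_out):
            jac[i][j] = (out_plus[i] - out_minus[i]) / (2.0 * eps)
    return jac


def off_subspace_penalty(jac: List[Vector],
                         subspaces: Sequence[Tuple[int, int]]) -> float:
    """Sum of squared Jacobian entries outside each output's feature slice."""
    penalty = 0.0
    for i, row in enumerate(jac):
        lo, hi = subspaces[i]  # half-open slice [lo, hi) owned by output i
        for j, value in enumerate(row):
            if not (lo <= j < hi):
                penalty += value * value
    return penalty


# Assumed layout: output 0 (pitch) owns features 0..1, output 1 (yaw) owns 2..3.
SUBSPACES = [(0, 2), (2, 4)]


def entangled_head(z: Vector) -> Vector:
    """Toy head where pitch leaks into the yaw slice and vice versa."""
    pitch = z[0] + z[1] + 0.5 * z[2]
    yaw = z[2] + z[3] + 0.3 * z[1]
    return [pitch, yaw]


def disentangled_head(z: Vector) -> Vector:
    """Toy head where each output reads only its own slice."""
    return [z[0] + z[1], z[2] + z[3]]


features = [0.1, -0.2, 0.3, 0.05]
entangled_penalty = off_subspace_penalty(
    numerical_jacobian(entangled_head, features), SUBSPACES)
disentangled_penalty = off_subspace_penalty(
    numerical_jacobian(disentangled_head, features), SUBSPACES)

print(f"entangled penalty:    {entangled_penalty:.4f}")
print(f"disentangled penalty: {disentangled_penalty:.4f}")
```

In this toy setup the penalty is nonzero only for the head that mixes slices, which is the behavior a disentangling regularizer of this kind would push a model away from during training. How DSCL weights the term, and which Jacobian it actually regularizes, would need to be checked against the paper and the released code.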

Related papers