Fugu-MT Paper Translation (Abstract): Alfa: Attentive Low-Rank Filter Adaptation for Structure-Aware Cross-Domain Personalized Gaze Estimation

Paper overview: Alfa: Attentive Low-Rank Filter Adaptation for Structure-Aware Cross-Domain Personalized Gaze Estimation

arxiv url: http://arxiv.org/abs/2603.08445v1
Date: Mon, 09 Mar 2026 14:43:04 GMT
Status: Translation complete
System update date: 2026-03-10 15:13:16.207787
Title: Alfa: Attentive Low-Rank Filter Adaptation for Structure-Aware Cross-Domain Personalized Gaze Estimation
Title (reference translation): Alfa: Attentive Low-Rank Filter Adaptation for Structure-Aware Cross-Domain Personalized Gaze Estimation
Authors: He-Yen Hsieh, Wei-Te Mark Ting, H. T. Kung
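Abstract summary: Test-time personalization adapts a pre-trained model to user-specific domain shifts using only unlabeled samples. The paper proposes Attentive Low-Rank Filter Adaptation (Alfa), which adapts gaze models by reweighting semantic patterns in pre-trained filters. Alfa achieves the lowest average gaze error across four cross-dataset benchmarks.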
Reference score (independently computed interest level): 3.6210224971711487
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Pre-trained gaze models learn to identify useful patterns commonly found across users, but subtle user-specific variations (e.g., eyelid shape or facial structure) can degrade model performance. Test-time personalization (TTP) adapts pre-trained models to these user-specific domain shifts using only a few unlabeled samples. Efficient fine-tuning is critical in performing this domain adaptation: data and computation resources can be limited, especially for on-device customization. While popular parameter-efficient fine-tuning (PEFT) methods address adaptation costs by updating only a small set of weights, they may not take full advantage of structures encoded in pre-trained filters. To more effectively leverage existing structures learned during pre-training, we reframe personalization as a process of reweighting existing features rather than learning entirely new ones. We present Attentive Low-Rank Filter Adaptation (Alfa) to adapt gaze models by reweighting semantic patterns in pre-trained filters. With Alfa, singular value decomposition (SVD) extracts dominant spatial components that capture eye and facial characteristics across users. Via an attention mechanism, we need only a few unlabeled samples to adjust and reweight pre-trained structures, selectively amplifying those relevant to a target user. Alfa achieves the lowest average gaze errors across four cross-dataset gaze benchmarks, outperforming existing TTP methods and low-rank adaptation (LoRA)-based variants. We also show that Alfa's attentive low-rank methods can be applied to applications beyond vision, such as diffusion-based language models.
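Abstract (reference translation): Pre-trained gaze models learn to identify useful patterns commonly found across users, but subtle user-specific variations (such as eyelid shape or facial structure) can degrade model performance. Test-time personalization (TTP) adapts pre-trained models to these user-specific domain shifts using a few unlabeled samples. Data and computation resources can be limited, especially for on-device customization. Widely used parameter-efficient fine-tuning (PEFT) methods address adaptation costs by updating only a small number of weights, but they may not fully exploit the structures encoded in pre-trained filters. To leverage the structures learned during pre-training more effectively, we reframe personalization as a process of reweighting existing features rather than learning entirely new ones. We propose Attentive Low-Rank Filter Adaptation (Alfa), which adapts gaze models by reweighting semantic patterns in pre-trained filters. In Alfa, singular value decomposition (SVD) extracts the dominant spatial components that capture users' eye and facial characteristics. Through an attention mechanism, only a few unlabeled samples are needed to adjust and reweight pre-trained structures, selectively amplifying those relevant to the target user. Alfa achieves the lowest average gaze error on four benchmarks, outperforming existing TTP methods and low-rank adaptation (LoRA)-based variants. We also show that Alfa's attentive low-rank approach can be applied beyond vision, for example to diffusion-based language models.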
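
The abstract describes extracting dominant components of pre-trained filters with SVD and then reweighting them. The following is a minimal NumPy sketch of that general idea only, not the authors' implementation: the function names `decompose_filters` and `reweight_filters`, the softmax gating, and the `logits` vector are all assumptions made for illustration. In particular, the abstract does not say how the attention weights are computed from unlabeled samples, so `logits` here is simply a hand-set stand-in for whatever an attention module would produce.

```python
import numpy as np


def decompose_filters(weight, rank):
    """Split a conv filter bank into its top-`rank` SVD components plus a residual.

    `weight` has shape (out_channels, in_channels, k, k); it is flattened to
    (out_channels, in_channels * k * k) before the decomposition.
    """
    flat = weight.reshape(weight.shape[0], -1)
    u, s, vt = np.linalg.svd(flat, full_matrices=False)
    u_r, s_r, vt_r = u[:, :rank], s[:rank], vt[:rank]
    # Everything not captured by the top-`rank` components stays frozen.
    residual = flat - (u_r * s_r) @ vt_r
    return u_r, s_r, vt_r, residual


def reweight_filters(u_r, s_r, vt_r, residual, logits, shape):
    """Rebuild the filters with each dominant component scaled by a gate.

    Assumption: gates are rank * softmax(logits), so uniform logits give
    gates of exactly 1 and reproduce the original filters.
    """
    rank = s_r.shape[0]
    e = np.exp(logits - logits.max())  # numerically stable softmax
    gates = rank * e / e.sum()
    flat = (u_r * (s_r * gates)) @ vt_r + residual
    return flat.reshape(shape)


# Toy filter bank standing in for one pre-trained conv layer (hypothetical data).
rng = np.random.default_rng(0)
weight = rng.standard_normal((8, 3, 3, 3))
parts = decompose_filters(weight, rank=4)

# Uniform logits: the layer is unchanged.
same = reweight_filters(*parts, logits=np.zeros(4), shape=weight.shape)

# Non-uniform logits: the first dominant component is amplified, the rest damped.
adapted = reweight_filters(
    *parts, logits=np.array([2.0, 0.0, 0.0, 0.0]), shape=weight.shape
)
```

In this sketch only the `rank` gate values would be adapted per user, while the singular vectors and the residual stay fixed, which is one way to read "reweighting existing features rather than learning entirely new ones".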

Related papers