Fugu-MT Paper Translation (Abstract): Bidirectional Learning of Facial Action Units and Expressions via Structured Semantic Mapping across Heterogeneous Datasets

Paper overview: Bidirectional Learning of Facial Action Units and Expressions via Structured Semantic Mapping across Heterogeneous Datasets

arxiv url: http://arxiv.org/abs/2604.10541v1
Date: Sun, 12 Apr 2026 09:08:32 GMT
Status: Translation complete
Last updated in system: 2026-04-14 20:13:16.086322
Title: Bidirectional Learning of Facial Action Units and Expressions via Structured Semantic Mapping across Heterogeneous Datasets
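Title (reference translation): Bidirectional learning of facial action units and facial expressions via structured semantic mapping across heterogeneous datasets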
Authors: Jia Li, Yu Zhang, Yin Chen, Zhenzhen Hu, Yong Li, Richang Hong, Shiguang Shan, Meng Wang
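Abstract summary: This work proposes a Structured Semantic Mapping (SSM) framework for bidirectional AU-FE learning under different data domains. SSM consists of three main components: (1) a shared visual backbone that learns unified facial representations from dynamic AU and FE videos; (2) semantic mediation via a Textual Semantic Prototype (TSP) module; and (3) a Dynamic Prior Mapping (DPM) module that incorporates prior knowledge derived from the Facial Action Coding System.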
Reference score (independently computed attention score): 85.74213192818668
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Facial action unit (AU) detection and facial expression (FE) recognition can be jointly viewed as affective facial behavior tasks, representing fine-grained muscular activations and coarse-grained holistic affective states, respectively. Despite their inherent semantic correlation, existing studies predominantly focus on knowledge transfer from AUs to FEs, while bidirectional learning remains insufficiently explored. In practice, this challenge is further compounded by heterogeneous data conditions, where AU and FE datasets differ in annotation paradigms (frame-level vs. clip-level), label granularity, and data availability and diversity, hindering effective joint learning. To address these issues, we propose a Structured Semantic Mapping (SSM) framework for bidirectional AU-FE learning under different data domains and heterogeneous supervision. SSM consists of three key components: (1) a shared visual backbone that learns unified facial representations from dynamic AU and FE videos; (2) semantic mediation via a Textual Semantic Prototype (TSP) module, which constructs structured semantic prototypes from fixed textual descriptions augmented with learnable context prompts, serving as supervision signals and cross-task alignment anchors in a shared semantic space; and (3) a Dynamic Prior Mapping (DPM) module that incorporates prior knowledge derived from the Facial Action Coding System and learns a data-driven association matrix in a high-level feature space, enabling explicit and bidirectional knowledge transfer. Extensive experiments on popular AU detection and FE recognition benchmarks show that SSM achieves state-of-the-art performance on both tasks simultaneously, and demonstrate that holistic expression semantics can in turn enhance fine-grained AU learning even across heterogeneous datasets.
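Abstract (reference translation): Facial action unit (AU) detection and facial expression (FE) recognition can both be regarded as affective facial behavior tasks, representing fine-grained muscle activations and coarse-grained holistic affective states, respectively. Despite their inherent semantic correlation, existing studies focus mainly on knowledge transfer from AUs to FEs, and bidirectional learning has not been sufficiently studied. In practice, the problem is further complicated by heterogeneous data conditions: AU and FE datasets differ in annotation paradigm (frame-level vs. clip-level), label granularity, and data availability and diversity, which hinders effective joint learning. To address these issues, we propose a Structured Semantic Mapping (SSM) framework for bidirectional AU-FE learning under different data domains and heterogeneous supervision. SSM consists of (1) a shared visual backbone that learns unified facial representations from dynamic AU and FE videos; (2) a Textual Semantic Prototype (TSP) module that builds structured semantic prototypes from fixed textual descriptions extended with learnable context prompts; and (3) a Dynamic Prior Mapping (DPM) module that incorporates prior knowledge derived from the Facial Action Coding System and learns a data-driven association matrix in a high-level feature space, enabling explicit, bidirectional knowledge transfer. Extensive experiments on popular AU detection and FE recognition benchmarks show that SSM achieves state-of-the-art performance on both tasks simultaneously, and that holistic expression semantics improve fine-grained AU learning even across heterogeneous datasets.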
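
The abstract describes the DPM module only at a high level: a FACS-derived prior combined with a learned association matrix that supports transfer in both directions. The following is a minimal sketch of that idea, not the authors' implementation. The AU-to-expression pairings are commonly cited prototypes used purely for illustration, the additive "prior plus residual" form is an assumption, and every name (`FACS_PRIOR`, `association_matrix`, `au_to_fe`, `fe_to_au`) is hypothetical.

```python
import math

# Illustrative label sets; the paper's actual AU and expression sets are not given here.
AUS = ["AU1", "AU2", "AU4", "AU5", "AU6", "AU12", "AU15", "AU26"]
EXPRESSIONS = ["happiness", "sadness", "surprise"]

# Commonly cited prototype pairings, used only as an example prior (assumption).
FACS_PRIOR = {
    "happiness": {"AU6", "AU12"},
    "sadness": {"AU1", "AU4", "AU15"},
    "surprise": {"AU1", "AU2", "AU5", "AU26"},
}


def build_prior_matrix():
    """Binary matrix with one row per expression and one column per AU."""
    return [[1.0 if au in FACS_PRIOR[fe] else 0.0 for au in AUS]
            for fe in EXPRESSIONS]


def association_matrix(prior, residual):
    """Assumed form: fixed prior plus a learnable residual of the same shape."""
    return [[p + r for p, r in zip(prior_row, residual_row)]
            for prior_row, residual_row in zip(prior, residual)]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def au_to_fe(assoc, au_scores):
    """AU -> FE direction: weight AU scores by each expression's row."""
    logits = [sum(w * a for w, a in zip(row, au_scores)) for row in assoc]
    return softmax(logits)


def fe_to_au(assoc, fe_probs):
    """FE -> AU direction: apply the transposed association matrix."""
    return [sum(assoc[i][j] * fe_probs[i] for i in range(len(fe_probs)))
            for j in range(len(AUS))]


prior = build_prior_matrix()
# A trained model would update this residual; zeros leave the prior unchanged.
residual = [[0.0] * len(AUS) for _ in EXPRESSIONS]
assoc = association_matrix(prior, residual)

# Strong AU6 and AU12 activations should favor "happiness".
au_scores = [0.0, 0.0, 0.0, 0.0, 0.9, 0.8, 0.0, 0.0]
fe_probs = au_to_fe(assoc, au_scores)

# A confident "happiness" prediction points back at AU6 and AU12.
au_hint = fe_to_au(assoc, [1.0, 0.0, 0.0])
```

Using one matrix and its transpose for the two directions is a simplification chosen for brevity; the abstract states only that the learned association enables explicit, bidirectional transfer in a high-level feature space.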


Related papers