Fugu-MT Paper Translation (Abstract): Disentangling Content from Style to Overcome Shortcut Learning: A Hybrid Generative-Discriminative Learning Framework

Paper overview: Disentangling Content from Style to Overcome Shortcut Learning: A Hybrid Generative-Discriminative Learning Framework

arxiv url: http://arxiv.org/abs/2509.11598v1
Date: Mon, 15 Sep 2025 05:28:32 GMT
Status: Translation complete
Last updated in system: 2025-09-16 17:26:23.16029
Title: Disentangling Content from Style to Overcome Shortcut Learning: A Hybrid Generative-Discriminative Learning Framework
Authors: Siming Fu, Sijun Dong, Xiaoliang Meng
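Abstract summary: Shortcut learning exploits superficial features such as texture instead of intrinsic structure. This paper proposes HyGDL, a hybrid framework that achieves explicit content-style disentanglement. Style is defined analytically as the component of a representation that is orthogonal to its style-invariant content, derived via vector projection.
Reference score (independently computed attention metric): 4.7403081236484335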
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Despite the remarkable success of Self-Supervised Learning (SSL), its generalization is fundamentally hindered by Shortcut Learning, where models exploit superficial features like texture instead of intrinsic structure. We experimentally verify this flaw within the generative paradigm (e.g., MAE) and argue it is a systemic issue also affecting discriminative methods, identifying it as the root cause of their failure on unseen domains. While existing methods often tackle this at a surface level by aligning or separating domain-specific features, they fail to alter the underlying learning mechanism that fosters shortcut dependency. To address this at its core, we propose HyGDL (Hybrid Generative-Discriminative Learning Framework), a hybrid framework that achieves explicit content-style disentanglement. Our approach is guided by the Invariance Pre-training Principle: forcing a model to learn an invariant essence by systematically varying a bias (e.g., style) at the input while keeping the supervision signal constant. HyGDL operates on a single encoder and analytically defines style as the component of a representation that is orthogonal to its style-invariant content, derived via vector projection.
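The last sentence of the abstract describes style as the part of a representation that is orthogonal to its style-invariant content, obtained by vector projection. The abstract gives no formula, so the following is only a minimal sketch of that idea under stated assumptions: `z` stands for an encoder representation, `c` for a content vector (how HyGDL actually obtains it is not specified here), and `split_content_style` is a hypothetical helper name, not part of the paper's code.

```python
import numpy as np


def split_content_style(z, c, eps=1e-12):
    """Split z into a part parallel to c and a residual orthogonal to c.

    Assumption: `c` plays the role of the style-invariant content direction
    and the orthogonal residual plays the role of style. This illustrates
    plain vector projection, not the paper's exact procedure.
    """
    z = np.asarray(z, dtype=float)
    c = np.asarray(c, dtype=float)
    # Projection coefficient <z, c> / <c, c>; eps guards against a zero vector.
    coeff = np.dot(z, c) / (np.dot(c, c) + eps)
    content = coeff * c      # component of z along c
    style = z - content      # residual, orthogonal to c
    return content, style


# Toy example with a 3-dimensional representation.
z = np.array([3.0, 4.0, 0.0])
c = np.array([1.0, 0.0, 0.0])
content, style = split_content_style(z, c)
```

By construction the two parts sum back to `z`, and the inner product of `style` with `c` is zero up to floating-point error, which is what "orthogonal to its style-invariant content" would mean in this simplified setting.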

Related papers