Fugu-MT Paper Translation (Abstract): Disentangled Representation Learning via Modular Compositional Bias

Paper summary: Disentangled Representation Learning via Modular Compositional Bias

arxiv url: http://arxiv.org/abs/2510.21402v1
Date: Fri, 24 Oct 2025 12:46:19 GMT
Status: Translation complete
Last updated in system: 2025-10-28 09:00:15.46791
Title: Disentangled Representation Learning via Modular Compositional Bias
Authors: Whie Jung, Dong Hoon Lee, Seunghoon Hong
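Abstract summary: We propose a compositional bias, a modular inductive bias decoupled from both objectives and architectures. Our key insight is that different factors obey distinct recombination rules in the data distribution. The proposed method shows competitive performance in both attribute and object disentanglement, and uniquely achieves joint disentanglement of global style and objects.
Reference score (independently computed attention score): 19.244228209387163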
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent disentangled representation learning (DRL) methods rely heavily on factor-specific strategies (either learning objectives for attributes or model architectures for objects) to embed inductive biases. Such divergent approaches result in significant overhead when novel factors of variation do not align with prior assumptions, such as statistical independence or spatial exclusivity, or when multiple factors coexist, as practitioners must redesign architectures or objectives. To address this, we propose a compositional bias, a modular inductive bias decoupled from both objectives and architectures. Our key insight is that different factors obey distinct recombination rules in the data distribution: global attributes are mutually exclusive, e.g., a face has one nose, while objects share a common support (any subset of objects can co-exist). We therefore randomly remix latents according to factor-specific rules, i.e., a mixing strategy, and force the encoder to discover whichever factor structure the mixing strategy reflects through two complementary objectives: (i) a prior loss that ensures every remix decodes into a realistic image, and (ii) the compositional consistency loss introduced by Wiedemer et al. (arXiv:2310.05327), which aligns each composite image with its corresponding composite latent. Under this general framework, simply adjusting the mixing strategy enables disentanglement of attributes, objects, and even both, without modifying the objectives or architectures. Extensive experiments demonstrate that our method shows competitive performance in both attribute and object disentanglement, and uniquely achieves joint disentanglement of global style and objects. Code is available at https://github.com/whieya/Compositional-DRL.
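The two recombination rules named in the abstract can be illustrated with a small sketch. This is one possible reading of the abstract, not the authors' implementation (see the linked repository for that). The function names `mix_attributes` and `mix_objects`, the `max_slots` parameter, and the representation of a latent as a list of slots are all assumptions made for illustration.

```python
import random


def mix_attributes(z_a, z_b, rng):
    """Attribute-style remix (hypothetical helper).

    Assumes slot k encodes the same attribute in every sample, so values
    for one attribute are mutually exclusive: each slot of the remix is
    taken from exactly one of the two sources.
    """
    if len(z_a) != len(z_b):
        raise ValueError("both latents need the same number of slots")
    return [a if rng.random() < 0.5 else b for a, b in zip(z_a, z_b)]


def mix_objects(z_a, z_b, rng, max_slots=None):
    """Object-style remix (hypothetical helper).

    Assumes slots are unordered object descriptors that share a common
    support, so any non-empty subset of the pooled slots is a valid remix.
    """
    pool = list(z_a) + list(z_b)
    limit = len(pool) if max_slots is None else min(max_slots, len(pool))
    count = rng.randint(1, limit)  # inclusive on both ends
    return rng.sample(pool, count)  # sampling without replacement


if __name__ == "__main__":
    rng = random.Random(0)
    z_a = [[1.0], [2.0], [3.0]]
    z_b = [[-1.0], [-2.0], [-3.0]]
    print(mix_attributes(z_a, z_b, rng))
    print(mix_objects(z_a, z_b, rng, max_slots=3))
```

In this reading, only the mixing function changes between the attribute and object settings. The encoder, the decoder, and the two losses from the abstract are outside the scope of the sketch.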
