Fugu-MT Paper Translation (Summary): Identifiable Multimodal Causal Representation Learning under Partial Latent Sharing

Paper summary: Identifiable Multimodal Causal Representation Learning under Partial Latent Sharing

arxiv url: http://arxiv.org/abs/2605.19135v1
Date: Mon, 18 May 2026 21:34:29 GMT
Status: Translation complete
System update date: 2026-05-20 15:03:09.000731
Title: Identifiable Multimodal Causal Representation Learning under Partial Latent Sharing
Title (reference translation): Multimodal Causal Representation Learning under Partial Latent Sharing
Authors: Manal Benhamza, Marianne Clausel, Myriam Tami
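Abstract summary: Causal representation learning (CRL) uncovers meaningful latent variables and their corresponding causal structure from observational data. Proving identifiability in CRL is intrinsically difficult, and this work addresses an even more challenging setting: multimodality. We consider multimodal observed data with a partially shared latent structure. Each modality is generated, through nonlinear mixing functions, from a specific subset of causal latent variables. Furthermore, our identifiability results also apply to the undercomplete scenario in which each modality has more observed than latent variables.
Reference score (independently computed attention measure): 4.400865943835067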
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Causal representation learning (CRL) seeks to uncover meaningful latent variables and their corresponding causal structure from high-dimensional observational data. Identifiability is a crucial property in CRL, as it ensures the recovery of the mechanisms behind the data generation process, and hence the interpretability and robustness of the representation. Proving identifiability in CRL is intrinsically difficult, and in this work we address an even more challenging setting: multimodality. We consider multimodal observed data with a partially shared latent structure. Each modality is generated, through nonlinear mixing functions, from a specific subset of causal latent variables. Under flexible assumptions and without imposing any parametric distribution on the latent variables, we establish component-wise identifiability guarantees for the causal latent representation. Furthermore, our identifiability results apply to the undercomplete scenario in which each modality has more observed than latent variables. To instantiate our theoretical analysis, we introduce a Wasserstein-based module to recover the partially shared latent structure. Because it is differentiable, this module can be easily integrated into all types of architectures, requiring only minimal changes. Extensive experiments on synthetic and realistic datasets validate the superiority of our approach over state-of-the-art (SOTA) methods.
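Abstract (reference translation): Causal representation learning (CRL) uncovers meaningful latent variables and their corresponding causal structure from high-dimensional observational data. Identifiability is an important property in CRL because it ensures the recovery of the mechanisms behind the data generation process, and therefore guarantees the interpretability and robustness of the representation. Proving identifiability in CRL is intrinsically difficult, and this work addresses an even more challenging setting: multimodality. We examine multimodal observed data with a partially shared latent structure. Each modality is generated, through nonlinear mixing functions, from a specific subset of causal latent variables. Under flexible assumptions, and without imposing a parametric distribution on the latent variables, we guarantee component-wise identifiability of the causal latent representation. Furthermore, our identifiability results also apply to the undercomplete scenario in which each modality has more observed than latent variables. To instantiate the theoretical analysis, we introduce a Wasserstein-based module that recovers the partially shared latent structure. Because it is differentiable, this module can be easily integrated into all types of architectures and requires only minimal changes. Extensive experiments on synthetic and realistic datasets validate the superiority of the approach over SOTA methods.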
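
The data-generating setting described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' code or experimental setup: the four-variable causal chain, the Laplace noise, the two modality subsets, and every function name below are assumptions chosen only for illustration. The sketch shows each modality observing a different, overlapping subset of causal latents through its own nonlinear mixing function, with more observed dimensions (10) than latents per modality (3), i.e. the undercomplete case.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    # Elementwise invertible nonlinearity.
    return np.where(x > 0, x, slope * x)

def sample_latents(n, rng):
    """Sample 4 causal latents from a toy chain z0 -> z1 -> z2 -> z3 (assumed SCM)."""
    noise = rng.laplace(size=(n, 4))  # non-Gaussian noise, an arbitrary choice
    z = np.zeros((n, 4))
    z[:, 0] = noise[:, 0]
    for j in range(1, 4):
        z[:, j] = np.tanh(z[:, j - 1]) + 0.5 * noise[:, j]
    return z

def make_mixing(d_in, d_out, rng):
    """Random nonlinear map from d_in latents to d_out > d_in observations.

    With Gaussian weights, w1 has full column rank and w2 is invertible
    with probability one, so the map is injective almost surely.
    """
    w1 = rng.normal(size=(d_in, d_out))
    w2 = rng.normal(size=(d_out, d_out))
    return lambda z: leaky_relu(z @ w1) @ w2

def generate(n=1000, seed=0):
    rng = np.random.default_rng(seed)
    z = sample_latents(n, rng)
    # Hypothetical partial sharing: latents 1 and 2 are shared,
    # latent 0 is private to modality_a and latent 3 to modality_b.
    subsets = {"modality_a": [0, 1, 2], "modality_b": [1, 2, 3]}
    obs = {}
    for name, idx in subsets.items():
        mix = make_mixing(len(idx), 10, rng)  # one mixing function per modality
        obs[name] = mix(z[:, idx])
    return z, subsets, obs
```

Calling `generate()` returns the latent matrix, the per-modality index subsets, and one observation matrix per modality. A method in this setting would see only the observation matrices and would need to recover both the latents and which of them are shared.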

Related papers