Fugu-MT Paper Translation (Abstract): Are We Making Progress in Multimodal Domain Generalization? A Comprehensive Benchmark Study

Paper overview: Are We Making Progress in Multimodal Domain Generalization? A Comprehensive Benchmark Study

arxiv url: http://arxiv.org/abs/2605.06643v1
Date: Thu, 07 May 2026 17:51:16 GMT
Status: Translation complete
System update date: 2026-05-08 22:27:12.065902
Title: Are We Making Progress in Multimodal Domain Generalization? A Comprehensive Benchmark Study
Title (reference translation): Progress in Multimodal Domain Generalization? A Comprehensive Benchmark Study
Authors: Hao Dong, Hongzhao Li, Shupan Li, Muhammad Haris Khan, Eleni Chatzi, Olga Fink
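Abstract summary: MMDG-Bench is the first unified and comprehensive benchmark for Multimodal Domain Generalization. It standardizes evaluation across six datasets spanning three diverse tasks, and assesses corruption robustness, missing-modality generalization, misclassification detection, and out-of-distribution detection.
Reference score (independently computed attention score): 36.264692761556596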
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Despite the growing popularity of Multimodal Domain Generalization (MMDG) for enhancing model robustness, it remains unclear whether reported performance gains reflect genuine algorithmic progress or are artifacts of inconsistent evaluation protocols. Current research is fragmented, with studies varying significantly across datasets, modality configurations, and experimental settings. Furthermore, existing benchmarks focus predominantly on action recognition, often neglecting critical real-world challenges such as input corruptions, missing modalities, and model trustworthiness. This lack of standardization obscures a reliable assessment of the field's advancement. To address this issue, we introduce MMDG-Bench, the first unified and comprehensive benchmark for MMDG, which standardizes evaluation across six datasets spanning three diverse tasks: action recognition, mechanical fault diagnosis, and sentiment analysis. MMDG-Bench encompasses six modality combinations, nine representative methods, and multiple evaluation settings. Beyond standard accuracy, it systematically assesses corruption robustness, missing-modality generalization, misclassification detection, and out-of-distribution detection. With 7,402 neural networks trained in total across 95 unique cross-domain tasks, MMDG-Bench yields five key findings: (1) under fair comparisons, recent specialized MMDG methods offer only marginal improvements over the ERM baseline; (2) no single method consistently outperforms others across datasets or modality combinations; (3) a substantial gap to upper-bound performance persists, indicating that MMDG remains far from solved; (4) trimodal fusion does not consistently outperform the strongest bimodal configurations; and (5) all evaluated methods exhibit significant degradation under corruption and missing-modality scenarios, with some methods further compromising model trustworthiness.
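Abstract (reference translation): Despite the growing popularity of Multimodal Domain Generalization (MMDG) for enhancing model robustness, it is unclear whether reported performance gains reflect genuine algorithmic progress or are artifacts of inconsistent evaluation protocols. Current research is fragmented and varies widely across datasets, modality configurations, and experimental settings. Furthermore, existing benchmarks focus mainly on action recognition and neglect critical real-world challenges such as input corruptions, missing modalities, and model trustworthiness. This lack of standardization obscures a reliable assessment of the field's progress. To address this issue, MMDG-Bench is the first unified and comprehensive benchmark for MMDG, standardizing evaluation across six datasets spanning three tasks: action recognition, mechanical fault diagnosis, and sentiment analysis. MMDG-Bench includes six modality combinations, nine representative methods, and multiple evaluation settings. Beyond standard accuracy, it systematically assesses corruption robustness, missing-modality generalization, misclassification detection, and out-of-distribution detection. With 7,402 neural networks trained in total across 95 unique cross-domain tasks, MMDG-Bench finds that (1) under fair comparisons, recent MMDG methods offer only marginal improvements over the ERM baseline, and (2) no single method consistently outperforms others across datasets or modality combinations.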

Related papers