Fugu-MT Paper Translation (Abstract): On Modality Bias Recognition and Reduction

Paper abstract: On Modality Bias Recognition and Reduction

arxiv url: http://arxiv.org/abs/2202.12690v1
Date: Fri, 25 Feb 2022 13:47:09 GMT
Status: Translation complete
Last updated in system: 2022-02-28 15:24:34.738638
Title: On Modality Bias Recognition and Reduction
Title (reference translation): On Modality Bias Recognition and Reduction
Authors: Yangyang Guo, Liqiang Nie, Harry Cheng, Zhiyong Cheng, Mohan Kankanhalli, Alberto Del Bimbo
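Abstract summary: We study the modality bias problem in the context of multi-modal classification. We propose a plug-and-play loss function method that adaptively learns the feature space for each label. Our method achieves remarkable performance improvements over the baselines.
Reference score (attention score computed by this site): 70.69194431713825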
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Making each modality in multi-modal data contribute is of vital importance to learning a versatile multi-modal model. Existing methods, however, are often dominated by one or a few modalities during model training, resulting in sub-optimal performance. In this paper, we refer to this problem as modality bias and attempt to study it systematically and comprehensively in the context of multi-modal classification. After several empirical analyses, we recognize that one modality affects the model prediction more simply because this modality has a spurious correlation with instance labels. Primarily to facilitate evaluation of the modality bias problem, we construct two datasets, one for colored digit recognition and one for video action recognition, in line with the Out-of-Distribution (OoD) protocol. Together with the benchmarks in the visual question answering task, we empirically demonstrate the performance degradation of existing methods on these OoD datasets, which serves as evidence of modality bias learning. In addition, to overcome this problem, we propose a plug-and-play loss function method, whereby the feature space for each label is adaptively learned according to the training set statistics. Thereafter, we apply this method to eight baselines in total to test its effectiveness. From the results on four datasets covering the above three tasks, our method yields remarkable performance improvements compared with the baselines, demonstrating its superiority in reducing the modality bias problem.
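Abstract (reference translation): Making each modality in multi-modal data contribute is extremely important for learning a versatile multi-modal model. However, existing methods are often dominated by one or a few modalities during model training, which leads to sub-optimal performance. In this paper, we call this problem modality bias and attempt to study it systematically and comprehensively in multi-modal classification. After several empirical analyses, we find that one modality influences the model prediction more because that modality has a spurious correlation with instance labels. Primarily to facilitate evaluation of the modality bias problem, we construct two datasets, for the colored digit recognition task and the video action recognition task, following the OoD (Out-of-Distribution) protocol. Together with the benchmarks in the visual question answering task, we empirically demonstrate the performance degradation of existing methods on the OoD datasets, which serves as evidence of modality bias learning. Furthermore, to solve this problem, we propose a plug-and-play loss function method that adaptively learns the feature space for each label based on training set statistics. We then apply the method to eight baselines to verify its effectiveness. From the results on four datasets for the above three tasks, our method achieves remarkable performance improvements over the baselines and reduces the modality bias problem.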
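The abstract describes the proposed loss only at a high level: the feature space for each label is adapted according to training set statistics. As a rough illustration of that idea, and not the paper's actual formulation, the sketch below applies a per-label margin to a cosine-similarity cross-entropy loss, with the margin derived from label frequency in the training set. The function names, the `max_margin` and `scale` parameters, and the rule that rarer labels receive larger margins are all assumptions made for this example.

```python
import math
from collections import Counter


def label_margins(train_labels, num_classes, max_margin=0.5):
    """Hypothetical rule: a label's margin grows as its training frequency shrinks."""
    counts = Counter(train_labels)
    total = len(train_labels)
    return [max_margin * (1.0 - counts.get(c, 0) / total)
            for c in range(num_classes)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def adaptive_margin_loss(feature, class_weights, target, margins, scale=16.0):
    """Cross-entropy over scaled cosine logits, with a margin on the target class.

    Subtracting the margin from the target logit forces the feature to be
    closer to its class vector before the loss becomes small. This is an
    assumed design, not taken from the paper.
    """
    logits = []
    for c, weight in enumerate(class_weights):
        cos = cosine(feature, weight)
        if c == target:
            cos -= margins[c]
        logits.append(scale * cos)
    # Numerically stable log-sum-exp.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[target]


# Toy usage: label 0 is frequent, label 1 is rare.
train_labels = [0, 0, 0, 1]
margins = label_margins(train_labels, num_classes=2)
class_weights = [[1.0, 0.0], [0.0, 1.0]]
feature = [0.2, 0.9]
loss_with_margin = adaptive_margin_loss(feature, class_weights, 1, margins)
loss_without_margin = adaptive_margin_loss(feature, class_weights, 1, [0.0, 0.0])
```

Because the margin only changes the logits, a loss of this shape can be swapped in for a standard cross-entropy term without touching the model architecture, which is one way to read "plug-and-play" here.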

Related papers

Combating Missing Modalities in Egocentric Videos at Test Time [92.38662956154256]
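Real-world applications often face problems with incomplete modalities due to privacy concerns, efficiency needs, or hardware issues. We propose a novel method that addresses this problem at test time without requiring retraining. MiDl is the first self-supervised, online solution that handles missing modalities exclusively at test time.
Paper reference translation (metadata) (2024-04-23T16:01:33Z)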
Borrowing Treasures from Neighbors: In-Context Learning for Multimodal Learning with Missing Modalities and Data Scarcity [9.811378971225727]
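This paper extends current research on missing modalities to the low-data regime. Acquiring full-modality data and sufficient annotated training samples is often costly. To address these two key issues, we propose retrieval-augmented in-context learning.
Paper reference translation (metadata) (2024-03-14T14:19:48Z)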
Debiasing Multimodal Models via Causal Information Minimization [65.23982806840182]
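We study bias arising from confounders in a causal graph for multimodal data. Robust predictive features contain diverse information that helps a model generalize to out-of-distribution data. We use these features as confounder representations and remove bias from models using methods motivated by causal theory.
Paper reference translation (metadata) (2023-11-28T16:46:14Z)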
Learning Unseen Modality Interaction [54.23533023883659]
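Multimodal learning assumes that all modality combinations are available during training in order to learn cross-modal correspondences. We pose the problem of unseen modality interaction and introduce a first solution. It uses a module that projects the multidimensional features of different modalities into a common space that preserves rich information.
Paper reference translation (metadata) (2023-06-22T10:53:10Z)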
Self-attention fusion for audiovisual emotion recognition with incomplete data [103.70855797025689]
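We consider the problem of multimodal data analysis with an application to audiovisual emotion recognition. We propose an architecture that can learn from raw data and describe three variants of it with different modality fusion mechanisms.
Paper reference translation (metadata) (2022-01-26T18:04:29Z)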
End-to-End Training of CNN Ensembles for Person Re-Identification [0.0]
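This paper proposes an end-to-end ensemble method for person re-identification to address the overfitting problem in discriminative models. The proposed ensemble learning framework produces diverse and accurate base learners within a single DenseNet. Experiments on several benchmark datasets show that the method achieves state-of-the-art results.
Paper reference translation (metadata) (2020-10-03T12:40:13Z)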
Rank-Based Multi-task Learning for Fair Regression [9.95899391250129]
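We develop a new learning method for multi-task regression models based on a biased dataset. It uses a common non-parametric, oracle-based non-world multiplier dataset.
Paper reference translation (metadata) (2020-09-23T22:32:57Z)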
Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction [166.87111665908333]
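This paper proposes a new multi-task learning method called Task-Feature Collaborative Learning (TFCL). Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks. As a practical extension, we extend the base model by accounting for overlapping features and distinguishing difficult tasks.
Paper reference translation (metadata) (2020-04-29T02:32:04Z)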

The related-paper list is generated automatically from the titles and abstracts of papers on this site.

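This is information about the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from the use of this site (including all information and translations).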