Paper summary: Multimodal Fusion Interactions: A Study of Human and Automatic
- arxiv url: http://arxiv.org/abs/2306.04125v1
- Date: Wed, 7 Jun 2023 03:44:50 GMT
- Status: Processing complete
- Last updated in system: 2023-06-08 16:16:35.450282
- Title: Multimodal Fusion Interactions: A Study of Human and Automatic
- Title (reference translation): Multimodal Fusion Interactions: A Study of Human and Automatic Quantification
- Authors: Paul Pu Liang, Yun Cheng, Ruslan Salakhutdinov, Louis-Philippe Morency
- Abstract summary: We show how human annotators can be leveraged to annotate two categorizations of multimodal interactions.
- Reference score (independently computed attention score): 128.59151550085275
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimodal fusion of multiple heterogeneous and interconnected signals is a
fundamental challenge in almost all multimodal problems and applications. In
order to perform multimodal fusion, we need to understand the types of
interactions that modalities can exhibit: how each modality individually
provides information useful for a task and how this information changes in the
presence of other modalities. In this paper, we perform a comparative study of
how human annotators can be leveraged to annotate two categorizations of
multimodal interactions: (1) partial labels, where different randomly assigned
annotators annotate the label given the first, second, and both modalities, and
(2) counterfactual labels, where the same annotator is tasked to annotate the
label given the first modality before giving them the second modality and
asking them to explicitly reason about how their answer changes, before
proposing an alternative taxonomy based on (3) information decomposition, where
annotators annotate the degrees of redundancy: the extent to which modalities
individually and together give the same predictions on the task, uniqueness:
the extent to which one modality enables a task prediction that the other does
not, and synergy: the extent to which only both modalities enable one to make a
prediction about the task that one would not otherwise make using either
modality individually. Through extensive experiments and annotations, we
highlight several opportunities and limitations of each approach and propose a
method to automatically convert annotations of partial and counterfactual
labels to information decomposition, yielding an accurate and efficient method
for quantifying interactions in multimodal datasets.
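The definitions of redundancy, uniqueness, and synergy in the abstract can be illustrated with a minimal sketch. It assumes each example carries three partial labels (given the first modality only, the second modality only, and both) and applies a simple agreement rule to them. The function names `interaction_type` and `interaction_fractions`, the category strings, and the toy labels are hypothetical. This heuristic is not the paper's conversion method, which the abstract does not specify.

```python
from collections import Counter

CATEGORIES = ("redundancy", "uniqueness_1", "uniqueness_2", "synergy")


def interaction_type(y1, y2, y12):
    """Toy agreement rule (an assumption, not the paper's method).

    y1, y2: labels annotated from the first / second modality alone.
    y12:    label annotated from both modalities together.
    """
    if y1 == y12 and y2 == y12:
        # Each modality alone already gives the same prediction as both.
        return "redundancy"
    if y1 == y12:
        # Only the first modality alone matches the joint prediction.
        return "uniqueness_1"
    if y2 == y12:
        # Only the second modality alone matches the joint prediction.
        return "uniqueness_2"
    # The joint prediction is reached by neither modality alone.
    return "synergy"


def interaction_fractions(partial_labels):
    """Fraction of examples that fall into each toy category."""
    if not partial_labels:
        raise ValueError("partial_labels must not be empty")
    counts = Counter(interaction_type(*labels) for labels in partial_labels)
    total = len(partial_labels)
    return {name: counts.get(name, 0) / total for name in CATEGORIES}


# Hypothetical partial labels: (first modality, second modality, both).
toy_labels = [
    ("pos", "pos", "pos"),
    ("pos", "neg", "pos"),
    ("neg", "pos", "pos"),
    ("neg", "neg", "pos"),
]
fractions = interaction_fractions(toy_labels)
print(fractions)
```

A per-example rule like this gives only a coarse, discrete picture. The abstract describes annotating degrees of each interaction, so a faithful implementation would produce graded values rather than a single category per example.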
- Incomplete Multi-view Multi-label Classification via a Dual-level Contrastive Learning Framework [1.224954637705144] (2024-11-27T12:04:04Z)
- Pseudo-Label Calibration Semi-supervised Multi-Modal Entity Alignment [7.147651976133246]: We introduce Pseudo-label Multimodal Entity Alignment (PCMEA) in a semi-supervised manner. (2024-03-02T12:44:59Z)
- Multimodal Learning Without Labeled Multimodal Data: Guarantees and Applications [90.6849884683226] (2023-06-07T15:44:53Z)
- Variational Distillation for Multi-View Learning [104.17551354374821] (2022-06-20T03:09:46Z)
- High-Modality Multimodal Transformer: Quantifying Modality & Interaction Heterogeneity for High-Modality Representation Learning [112.51498431119616] (2022-03-02T18:56:20Z)
- Single versus Multiple Annotation for Named Entity Recognition of Mutations [4.213427823201119] (2021-01-19T03:54:17Z)
- Interactive Fusion of Multi-level Features for Compositional Activity Recognition [100.75045558068874] (2020-12-10T14:17:18Z)
- Self-Attention Attribution: Interpreting Information Interactions Inside Transformer [89.21584915290319]: This work shows that the attributions can be used as adversarial patterns to implement non-targeted attacks against BERT. (2020-04-23T14:58:22Z)