Hybrid Multimodal Fusion for Humor Detection
- URL: http://arxiv.org/abs/2209.11949v1
- Date: Sat, 24 Sep 2022 07:45:04 GMT
- Title: Hybrid Multimodal Fusion for Humor Detection
- Authors: Haojie Xu, Weifeng Liu, Jingwei Liu, Mingzheng Li, Yu Feng, Yasi Peng,
Yunwei Shi, Xiao Sun and Meng Wang
- Abstract summary: We present our solution to the MuSe-Humor sub-challenge of the Multimodal Emotional Challenge (MuSe) 2022.
The goal of the MuSe-Humor sub-challenge is to detect humor in audiovisual recordings of German football Bundesliga press conferences, with performance measured by AUC.
- Score: 16.178078156094067
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present our solution to the MuSe-Humor sub-challenge of the
Multimodal Emotional Challenge (MuSe) 2022. The goal of the MuSe-Humor
sub-challenge is to detect humor, evaluated by AUC, in audiovisual recordings of
German football Bundesliga press conferences, which are annotated for humor
displayed by the coaches. For this sub-challenge, we first build a discriminative
model using a transformer module and a BiLSTM module, and then propose a hybrid
fusion strategy that uses the prediction results of each modality to improve the
model's performance. Our experiments demonstrate the effectiveness of the proposed
model and hybrid fusion strategy for multimodal fusion; on the test set, the
proposed model achieves an AUC of 0.8972.
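
The abstract describes the architecture only at a high level: a per-modality discriminative model built from transformer and BiLSTM modules, followed by a hybrid fusion of the per-modality predictions, evaluated by AUC. The sketch below is not the authors' released code; it shows one minimal way such a pipeline could look in PyTorch, and all feature dimensions, layer sizes, and the averaging fusion rule are illustrative assumptions.

```python
# Minimal sketch (not the authors' implementation): a per-modality classifier built
# from a Transformer encoder followed by a BiLSTM, plus a simple decision-level
# ("hybrid") fusion that combines each modality's prediction scores. Feature
# dimensions, layer sizes, and the averaging fusion rule are assumptions.
import torch
import torch.nn as nn
from sklearn.metrics import roc_auc_score


class ModalityClassifier(nn.Module):
    """Transformer encoder + BiLSTM over one modality's frame-level features."""

    def __init__(self, feat_dim: int, d_model: int = 128, hidden: int = 64):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=4, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.bilstm = nn.LSTM(d_model, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)  # binary humor logit

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, feat_dim) sequence of frame-level features
        h = self.encoder(self.proj(x))
        h, _ = self.bilstm(h)
        return self.head(h[:, -1])  # logit from the last time step


def hybrid_fusion(per_modality_logits):
    """Decision-level fusion: average the per-modality humor probabilities."""
    probs = torch.stack([torch.sigmoid(l) for l in per_modality_logits])
    return probs.mean(dim=0)


# Toy usage with random "audio" and "video" features (shapes are assumptions).
audio_net = ModalityClassifier(feat_dim=88)    # e.g. acoustic descriptor features
video_net = ModalityClassifier(feat_dim=512)   # e.g. face-embedding features
audio, video = torch.randn(8, 50, 88), torch.randn(8, 50, 512)
fused = hybrid_fusion([audio_net(audio), video_net(video)])

labels = torch.tensor([0, 1, 0, 1, 1, 0, 1, 0])
print("AUC:", roc_auc_score(labels.numpy(), fused.detach().squeeze(-1).numpy()))
```

A learned or weighted combination could replace the simple average here; the point of the sketch is only the overall structure, namely independent modality encoders whose prediction scores are fused at the decision level.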
Related papers
- DepMamba: Progressive Fusion Mamba for Multimodal Depression Detection [37.701518424351505]
Depression is a common mental disorder that affects millions of people worldwide.
We propose an audio-visual progressive fusion Mamba for multimodal depression detection, termed DepMamba.
arXiv Detail & Related papers (2024-09-24T09:58:07Z)
- SZTU-CMU at MER2024: Improving Emotion-LLaMA with Conv-Attention for Multimodal Emotion Recognition [65.19303535139453]
We present our winning approach for the MER-NOISE and MER-OV tracks of the MER2024 Challenge on multimodal emotion recognition.
Our system leverages the advanced emotional understanding capabilities of Emotion-LLaMA to generate high-quality annotations for unlabeled samples.
For the MER-OV track, our utilization of Emotion-LLaMA for open-vocabulary annotation yields an 8.52% improvement in average accuracy and recall compared to GPT-4V.
arXiv Detail & Related papers (2024-08-20T02:46:03Z)
- CANAMRF: An Attention-Based Model for Multimodal Depression Detection [7.266707571724883]
We present a Cross-modal Attention Network with Adaptive Multi-modal Recurrent Fusion (CANAMRF) for multimodal depression detection.
CANAMRF consists of a multimodal feature extractor, an Adaptive Multimodal Recurrent Fusion module, and a Hybrid Attention Module.
arXiv Detail & Related papers (2024-01-04T12:08:16Z)
- MMoE: Enhancing Multimodal Models with Mixtures of Multimodal Interaction Experts [92.76662894585809]
We introduce an approach to enhance multimodal models, which we call Multimodal Mixtures of Experts (MMoE).
MMoE can be applied to various types of models to improve their performance.
arXiv Detail & Related papers (2023-11-16T05:31:21Z)
- Provable Dynamic Fusion for Low-Quality Multimodal Data [94.39538027450948]
Dynamic multimodal fusion emerges as a promising learning paradigm.
Despite its widespread use, theoretical justifications in this field are still notably lacking.
This paper provides a theoretical analysis of dynamic fusion under a popular multimodal fusion framework from the generalization perspective.
A novel multimodal fusion framework termed Quality-aware Multimodal Fusion (QMF) is proposed, which improves classification accuracy and model robustness (a generic sketch of confidence-weighted fusion appears after this list).
arXiv Detail & Related papers (2023-06-03T08:32:35Z)
- Equivariant Multi-Modality Image Fusion [124.11300001864579]
We propose the Equivariant Multi-Modality imAge fusion paradigm for end-to-end self-supervised learning.
Our approach is rooted in the prior knowledge that natural imaging responses are equivariant to certain transformations.
Experiments confirm that EMMA yields high-quality fusion results for infrared-visible and medical images.
arXiv Detail & Related papers (2023-05-19T05:50:24Z)
- DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion [144.9653045465908]
We propose a novel fusion algorithm based on the denoising diffusion probabilistic model (DDPM).
Our approach yields promising fusion results in infrared-visible image fusion and medical image fusion.
arXiv Detail & Related papers (2023-03-13T04:06:42Z)
- Hybrid Multimodal Feature Extraction, Mining and Fusion for Sentiment Analysis [31.097398034974436]
We present our solutions for the Multimodal Sentiment Analysis Challenge (MuSe) 2022, which includes MuSe-Humor, MuSe-Reaction and MuSe-Stress Sub-challenges.
MuSe 2022 focuses on humor detection, emotional reactions, and multimodal emotional stress, utilising different modalities and data sets.
arXiv Detail & Related papers (2022-08-05T09:07:58Z)
- Hateful Memes Challenge: An Enhanced Multimodal Framework [0.0]
The Hateful Memes Challenge proposed by Facebook AI has attracted contestants around the world.
Various state-of-the-art deep learning models have been applied to this problem.
In this paper, we enhance the hateful meme detection framework, including utilizing Detectron for feature extraction.
arXiv Detail & Related papers (2021-12-20T07:47:17Z)
- H2NF-Net for Brain Tumor Segmentation using Multimodal MR Imaging: 2nd Place Solution to BraTS Challenge 2020 Segmentation Task [96.49879910148854]
Our H2NF-Net uses the single and cascaded HNF-Nets to segment different brain tumor sub-regions.
We trained and evaluated our model on the Multimodal Brain Tumor Challenge (BraTS) 2020 dataset.
Our method won the second place in the BraTS 2020 challenge segmentation task out of nearly 80 participants.
arXiv Detail & Related papers (2020-12-30T20:44:55Z)
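
Several of the papers above, such as the hybrid fusion in the present paper and QMF's quality-aware fusion, revolve around weighting each modality's prediction before combining them. As a purely generic illustration, and not the algorithm of any specific paper listed here, the sketch below weights each modality's binary probability by a confidence score derived from its prediction entropy; both the confidence measure and the weighting rule are assumptions made for illustration.

```python
# Generic sketch of confidence-weighted decision-level fusion (illustration only,
# not the QMF algorithm or the MuSe-Humor authors' hybrid fusion). Each modality's
# binary probability is weighted by a confidence score derived from its entropy.
import numpy as np


def entropy_confidence(p: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Confidence = 1 - binary entropy; low-entropy (sure) modalities get more weight."""
    h = -(p * np.log2(p + eps) + (1 - p) * np.log2(1 - p + eps))
    return 1.0 - h  # in [0, 1] for binary probabilities


def weighted_fusion(prob_per_modality: np.ndarray) -> np.ndarray:
    """prob_per_modality: (n_modalities, n_samples) array of humor probabilities."""
    conf = entropy_confidence(prob_per_modality)               # (n_mod, n_samples)
    weights = conf / (conf.sum(axis=0, keepdims=True) + 1e-8)  # normalize per sample
    return (weights * prob_per_modality).sum(axis=0)           # fused (n_samples,)


# Toy usage: a confident audio model and an uncertain video model.
audio_probs = np.array([0.95, 0.10, 0.85, 0.20])
video_probs = np.array([0.55, 0.45, 0.60, 0.50])
print(weighted_fusion(np.stack([audio_probs, video_probs])))
```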