Targeted aspect based multimodal sentiment analysis: an attention capsule
extraction and multi-head fusion network
- URL: http://arxiv.org/abs/2103.07659v1
- Date: Sat, 13 Mar 2021 09:11:24 GMT
- Title: Targeted aspect based multimodal sentiment analysis: an attention capsule
extraction and multi-head fusion network
- Authors: Jiaqian Wang, Donghong Gu, Chi Yang, Yun Xue, Zhengxin Song, Haoliang
Zhao, Luwei Xiao
- Abstract summary: We propose the task of targeted aspect-based multimodal sentiment analysis (TABMSA) for the first time.
An attention capsule extraction and multi-head fusion network (EF-Net) is devised for the task of TABMSA.
We evaluate the proposed model on two manually annotated datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Multimodal sentiment analysis has proven its significance in a
variety of domains. For sentiment analysis, different aspects drawn from
distinct modalities, all corresponding to one target, are processed and
analyzed. In this work, we propose the task of targeted aspect-based
multimodal sentiment analysis (TABMSA) for the first time. Furthermore, an
attention capsule extraction and multi-head fusion network (EF-Net) is
devised for the task of TABMSA. The multi-head attention (MHA) based network
and ResNet-152 are employed to process the texts and images, respectively.
The integration of MHA and the capsule network aims to capture the
interactions among the multimodal inputs. In addition to the targeted aspect,
information from the context and the image is also incorporated to determine
the delivered sentiment. We evaluate the proposed model on two manually
annotated datasets. The experimental results demonstrate the effectiveness of
our proposed model on this new task.
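
To make the described pipeline concrete, below is a minimal PyTorch sketch of an EF-Net-style forward pass: the targeted aspect attends to the context via multi-head attention, pooled ResNet-152 image features are projected and fused through a second MHA block, and a capsule-style squash nonlinearity precedes classification. The hidden sizes, the single squash step standing in for full capsule routing, and all module names here are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class EFNetSketch(nn.Module):
    """Illustrative EF-Net-style fusion; not the authors' exact model."""

    def __init__(self, d_model=768, n_heads=8, n_classes=3):
        super().__init__()
        # Aspect-to-context attention over text embeddings.
        self.text_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Pooled ResNet-152 features are 2048-d; project into the text space.
        self.img_proj = nn.Linear(2048, d_model)
        # Cross-modal attention: text representations query the image.
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.classifier = nn.Linear(d_model, n_classes)

    @staticmethod
    def squash(x, eps=1e-8):
        # Capsule-style squashing nonlinearity (Sabour et al., 2017),
        # standing in for full dynamic routing in this sketch.
        sq = (x ** 2).sum(dim=-1, keepdim=True)
        return (sq / (1.0 + sq)) * x / torch.sqrt(sq + eps)

    def forward(self, context_emb, aspect_emb, img_feat):
        # context_emb: (B, L, d) token embeddings of the sentence,
        # aspect_emb:  (B, La, d) embeddings of the targeted aspect,
        # img_feat:    (B, 2048) pooled ResNet-152 image features.
        ctx, _ = self.text_attn(aspect_emb, context_emb, context_emb)
        img = self.img_proj(img_feat).unsqueeze(1)      # (B, 1, d)
        fused, _ = self.cross_attn(ctx, img, img)       # (B, La, d)
        caps = self.squash(fused.mean(dim=1))           # (B, d)
        return self.classifier(caps)

model = EFNetSketch()
logits = model(torch.randn(2, 20, 768), torch.randn(2, 3, 768), torch.randn(2, 2048))
print(logits.shape)  # torch.Size([2, 3])
```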
Related papers
- GCM-Net: Graph-enhanced Cross-Modal Infusion with a Metaheuristic-Driven Network for Video Sentiment and Emotion Analysis [2.012311338995539]
This paper presents a novel framework that leverages multi-modal contextual information from utterances and applies metaheuristic algorithms to learn utterance-level sentiment and emotion prediction.
To show the effectiveness of our approach, we have conducted extensive evaluations on three prominent multimodal benchmark datasets.
arXiv Detail & Related papers (2024-10-02T10:07:48Z)
- A Novel Energy based Model Mechanism for Multi-modal Aspect-Based
Sentiment Analysis [85.77557381023617]
We propose a novel framework called DQPSA for multi-modal sentiment analysis.
The PDQ module uses the prompt as both a visual query and a language query to extract prompt-aware visual information.
The EPE module models the boundary pairing of the analysis target from the perspective of an energy-based model.
arXiv Detail & Related papers (2023-12-13T12:00:46Z)
- UniSA: Unified Generative Framework for Sentiment Analysis [48.78262926516856]
Sentiment analysis aims to understand people's emotional states and predict emotional categories based on multimodal information.
It consists of several subtasks, such as emotion recognition in conversation (ERC), aspect-based sentiment analysis (ABSA), and multimodal sentiment analysis (MSA).
arXiv Detail & Related papers (2023-09-04T03:49:30Z)
- Multi-Grained Multimodal Interaction Network for Entity Linking [65.30260033700338]
The multimodal entity linking (MEL) task aims at resolving ambiguous mentions to a multimodal knowledge graph.
We propose a novel Multi-GraIned Multimodal InteraCtion Network (MIMIC) framework for solving the MEL task.
arXiv Detail & Related papers (2023-07-19T02:11:19Z)
- Sequential Late Fusion Technique for Multi-modal Sentiment Analysis [0.0]
We use the text, audio and visual modalities from the MOSI dataset.
We propose a novel fusion technique using a multi-head attention LSTM network (sketched below).
arXiv Detail & Related papers (2021-06-22T01:32:41Z)
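
A minimal sketch of such a sequential late-fusion setup, assuming one LSTM encoder per modality whose final states are fused pairwise by multi-head attention; the feature dimensions (typical MOSI-style text/audio/visual sizes), the fusion order, and the module names are assumptions for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn

class LateFusionMHASketch(nn.Module):
    """Illustrative sequential late fusion via multi-head attention
    over per-modality LSTM summaries; not the paper's exact design."""

    def __init__(self, in_dims=(300, 74, 47), d=128, n_heads=4, n_classes=3):
        super().__init__()
        # One LSTM encoder per modality (text, audio, visual).
        self.encoders = nn.ModuleList(
            nn.LSTM(k, d, batch_first=True) for k in in_dims)
        self.fuse = nn.MultiheadAttention(d, n_heads, batch_first=True)
        self.head = nn.Linear(d, n_classes)

    def forward(self, seqs):
        # seqs: one (B, T_m, in_dim_m) sequence per modality.
        summaries = []
        for enc, x in zip(self.encoders, seqs):
            _, (h, _) = enc(x)                    # final hidden state (1, B, d)
            summaries.append(h[-1].unsqueeze(1))  # (B, 1, d)
        fused = summaries[0]
        for s in summaries[1:]:                   # fold modalities in one by one
            fused, _ = self.fuse(fused, s, s)
        return self.head(fused.squeeze(1))

model = LateFusionMHASketch()
logits = model([torch.randn(2, 50, 300), torch.randn(2, 50, 74), torch.randn(2, 50, 47)])
print(logits.shape)  # torch.Size([2, 3])
```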
- SOSD-Net: Joint Semantic Object Segmentation and Depth Estimation from
Monocular images [94.36401543589523]
We introduce the concept of semantic objectness to exploit the geometric relationship of these two tasks.
We then propose a Semantic Object and Depth Estimation Network (SOSD-Net) based on the objectness assumption.
To the best of our knowledge, SOSD-Net is the first network that exploits the geometry constraint for simultaneous monocular depth estimation and semantic segmentation.
arXiv Detail & Related papers (2021-01-19T02:41:03Z)
- Transformer-based Multi-Aspect Modeling for Multi-Aspect Multi-Sentiment
Analysis [56.893393134328996]
We propose a novel Transformer-based Multi-aspect Modeling scheme (TMM), which can capture potential relations between multiple aspects and simultaneously detect the sentiment of all aspects in a sentence.
Our method achieves noticeable improvements compared with strong baselines such as BERT and RoBERTa.
arXiv Detail & Related papers (2020-11-01T11:06:31Z)
- MISA: Modality-Invariant and -Specific Representations for Multimodal
Sentiment Analysis [48.776247141839875]
We propose a novel framework, MISA, which projects each modality to two distinct subspaces.
The first subspace is modality-invariant, where the representations across modalities learn their commonalities and reduce the modality gap.
Our experiments on popular sentiment analysis benchmarks, MOSI and MOSEI, demonstrate significant gains over state-of-the-art models.
arXiv Detail & Related papers (2020-05-07T15:13:23Z)
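
MISA's two-subspace idea lends itself to a compact sketch as well: a shared encoder yields modality-invariant vectors, per-modality private encoders yield modality-specific ones, and a penalty keeps the two apart before a fusion head. The input dimensions, the squared-dot-product orthogonality penalty, and the plain concatenation head below are assumptions; the paper itself also employs similarity and reconstruction losses and a Transformer-based fusion.

```python
import torch
import torch.nn as nn

class MISASketch(nn.Module):
    """Illustrative MISA-style projection into modality-invariant and
    modality-specific subspaces; not the authors' full model."""

    def __init__(self, in_dims=(768, 74, 47), d=128, n_outputs=1):
        super().__init__()
        # Map each modality into a common d-dimensional space.
        self.project = nn.ModuleList(nn.Linear(k, d) for k in in_dims)
        # One shared (invariant) encoder; one private (specific) encoder each.
        self.shared = nn.Sequential(nn.Linear(d, d), nn.ReLU())
        self.private = nn.ModuleList(
            nn.Sequential(nn.Linear(d, d), nn.ReLU()) for _ in in_dims)
        # Fuse 3 invariant + 3 specific vectors for the sentiment output.
        self.head = nn.Linear(6 * d, n_outputs)

    def forward(self, feats):
        # feats: one (B, in_dim_m) pooled utterance feature per modality.
        hs = [p(x) for p, x in zip(self.project, feats)]
        inv = [self.shared(h) for h in hs]
        spec = [enc(h) for enc, h in zip(self.private, hs)]
        # Orthogonality penalty pushes the two subspaces apart.
        ortho = sum((i * s).sum(-1).pow(2).mean() for i, s in zip(inv, spec))
        return self.head(torch.cat(inv + spec, dim=-1)), ortho

model = MISASketch()
pred, penalty = model([torch.randn(4, 768), torch.randn(4, 74), torch.randn(4, 47)])
print(pred.shape)  # torch.Size([4, 1])
```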