Knowledge-Enhanced Hierarchical Information Correlation Learning for
Multi-Modal Rumor Detection
- URL: http://arxiv.org/abs/2306.15946v1
- Date: Wed, 28 Jun 2023 06:08:20 GMT
- Title: Knowledge-Enhanced Hierarchical Information Correlation Learning for
Multi-Modal Rumor Detection
- Authors: Jiawei Liu, Jingyi Xie, Fanrui Zhang, Qiang Zhang, Zheng-jun Zha
- Abstract summary: We propose a novel knowledge-enhanced hierarchical information correlation learning approach (KhiCL) for multi-modal rumor detection.
KhiCL exploits a cross-modal joint dictionary to transfer heterogeneous unimodal features into a common feature space.
It extracts visual and textual entities from images and text, and designs a knowledge relevance reasoning strategy.
- Score: 82.94413676131545
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The explosive growth of rumors with text and images on social media platforms
has drawn great attention. Existing studies have made significant contributions
to cross-modal information interaction and fusion, but they fail to fully
explore the hierarchical and complex semantic correlations across the content
of different modalities, severely limiting their performance in detecting
multi-modal rumors. In
this work, we propose a novel knowledge-enhanced hierarchical information
correlation learning approach (KhiCL) for multi-modal rumor detection by
jointly modeling the basic semantic correlation and high-order
knowledge-enhanced entity correlation. Specifically, KhiCL exploits a
cross-modal joint dictionary to transfer heterogeneous unimodal features into
a common feature space and captures basic cross-modal semantic consistency
and inconsistency via a cross-modal fusion layer. Moreover, considering that
the description of multi-modal content is narrated around entities, KhiCL
extracts visual and textual entities from images and text and designs a
knowledge relevance reasoning strategy: it finds the shortest semantically
relevant path between each pair of entities in an external knowledge graph
and absorbs the complementary contextual knowledge of the other entities
connected along this path to learn knowledge-enhanced entity representations.
Furthermore, KhiCL utilizes a signed attention mechanism to model the
knowledge-enhanced entity consistency and inconsistency of intra-modality and
inter-modality entity pairs by measuring their corresponding semantic
relevance distance. Extensive experiments
have demonstrated the effectiveness of the proposed method.
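
To make the described pipeline concrete, here is a minimal sketch, assuming a toy knowledge graph, randomly initialized entity embeddings, and a cosine-based signed weight. None of this is the authors' implementation; it only illustrates the two entity-level steps the abstract names (absorbing contextual knowledge along the shortest knowledge-graph path between an entity pair, and scoring pair consistency with a signed weight).

import networkx as nx
import numpy as np

# Toy external knowledge graph; a real system would query a large KG.
kg = nx.Graph()
kg.add_edges_from([
    ("flood", "disaster"), ("disaster", "rescue"),
    ("rescue", "helicopter"), ("flood", "river"),
])

# Randomly initialized entity embeddings (dim 4) stand in for the
# outputs of a learned knowledge-graph encoder.
rng = np.random.default_rng(0)
emb = {node: rng.normal(size=4) for node in kg.nodes}

def knowledge_enhanced(head: str, tail: str) -> np.ndarray:
    """Absorb contextual knowledge along the shortest path between two
    entities by averaging the embeddings of every entity on that path."""
    path = nx.shortest_path(kg, source=head, target=tail)
    return np.mean([emb[n] for n in path], axis=0)

def signed_weight(h: np.ndarray, t: np.ndarray) -> float:
    """Signed attention-style weight: positive for semantically close
    (consistent) entity pairs, negative for distant (inconsistent) ones."""
    return float(h @ t / (np.linalg.norm(h) * np.linalg.norm(t) + 1e-8))

h = knowledge_enhanced("flood", "helicopter")
t = knowledge_enhanced("flood", "river")
print(f"signed consistency weight: {signed_weight(h, t):+.3f}")

In the full model, the placeholder choices here (mean pooling along the path, raw cosine similarity) would be learned components, and the signed weights would feed an attention layer over all intra- and inter-modality entity pairs.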
Related papers
- Re-mine, Learn and Reason: Exploring the Cross-modal Semantic Correlations for Language-guided HOI detection [57.13665112065285]
Human-Object Interaction (HOI) detection is a challenging computer vision task.
We present a framework that enhances HOI detection by incorporating structured text knowledge.
arXiv Detail & Related papers (2023-07-25T14:20:52Z)
- Multi-Grained Multimodal Interaction Network for Entity Linking [65.30260033700338]
Multimodal entity linking task aims at resolving ambiguous mentions to a multimodal knowledge graph.
We propose a novel Multi-GraIned Multimodal InteraCtion Network (MIMIC) framework for solving the MEL task.
arXiv Detail & Related papers (2023-07-19T02:11:19Z)
- Multimodal Relation Extraction with Cross-Modal Retrieval and Synthesis [89.04041100520881]
This research proposes to retrieve textual and visual evidence based on the object, sentence, and whole image.
We develop a novel approach to synthesize the object-level, image-level, and sentence-level information for better reasoning between the same and different modalities.
arXiv Detail & Related papers (2023-05-25T15:26:13Z)
- CLIP-Driven Fine-grained Text-Image Person Re-identification [50.94827165464813]
Text-image person re-identification (TIReID) aims to retrieve the image corresponding to a given text query from a pool of candidate images.
We propose a CLIP-driven Fine-grained information excavation framework (CFine) to fully utilize the powerful knowledge of CLIP for TIReID.
arXiv Detail & Related papers (2022-10-19T03:43:12Z)
- Learning Attention-based Representations from Multiple Patterns for Relation Prediction in Knowledge Graphs [2.4028383570062606]
AEMP is a novel model for learning contextualized representations by acquiring entities' context information.
AEMP either outperforms or competes with state-of-the-art relation prediction methods.
arXiv Detail & Related papers (2022-06-07T10:53:35Z)
- VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification [3.7798600249187295]
Multimodal learning from document data has recently achieved great success, as it allows pre-training semantically meaningful features as a prior for learnable downstream tasks.
In this paper, we approach the document classification problem by learning cross-modal representations through language and vision cues.
The proposed method exploits high-level interactions and learns relevant semantic information from effective attention flows within and across modalities.
arXiv Detail & Related papers (2022-05-24T12:28:12Z)
- Hierarchical Cross-Modality Semantic Correlation Learning Model for Multimodal Summarization [4.714335699701277]
Multimodal summarization with multimodal output (MSMO) generates a summary with both textual and visual content.
Traditional MSMO methods handle the different modalities indiscriminately, learning a single representation for the whole input.
We propose a hierarchical cross-modality semantic correlation learning model (HCSCL) to learn the intra- and inter-modal correlation existing in the multimodal data.
arXiv Detail & Related papers (2021-12-16T01:46:30Z)
- Learning Relation Alignment for Calibrated Cross-modal Retrieval [52.760541762871505]
We propose a novel metric, Intra-modal Self-attention Distance (ISD), to quantify the relation consistency by measuring the semantic distance between linguistic and visual relations.
We present Inter-modal Alignment on Intra-modal Self-attentions (IAIS), a regularized training method that optimizes the ISD and calibrates intra-modal self-attentions mutually via inter-modal alignment (see the sketch after this list).
arXiv Detail & Related papers (2021-05-28T14:25:49Z)
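
As a rough illustration of the last entry, the sketch below treats ISD as a distance between text-side and image-side intra-modal self-attention maps; the shared token count, the toy features, and the mean-absolute-difference distance are assumptions rather than that paper's exact formulation.

import numpy as np

def self_attention(x: np.ndarray) -> np.ndarray:
    """Row-softmax of scaled dot-product scores; x has shape (tokens, dim)."""
    scores = x @ x.T / np.sqrt(x.shape[1])
    e = np.exp(scores - scores.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def isd(text_feats: np.ndarray, visual_feats: np.ndarray) -> float:
    """Assumed ISD: mean absolute difference between the two intra-modal
    self-attention matrices (aligned to the same token count)."""
    return float(np.abs(self_attention(text_feats)
                        - self_attention(visual_feats)).mean())

rng = np.random.default_rng(1)
text_feats, visual_feats = rng.normal(size=(5, 8)), rng.normal(size=(5, 8))
print(f"ISD (lower = more consistent relations): {isd(text_feats, visual_feats):.4f}")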