Related papers: Mixture of Modality Knowledge Experts for Robust Multi-modal Knowledge Graph Completion

URL: http://arxiv.org/abs/2405.16869v1
Date: Mon, 27 May 2024 06:36:17 GMT
Title: Mixture of Modality Knowledge Experts for Robust Multi-modal Knowledge Graph Completion
Authors: Yichi Zhang, Zhuo Chen, Lingbing Guo, Yajing Xu, Binbin Hu, Ziqi Liu, Wen Zhang, Huajun Chen
Abstract summary: Multi-modal knowledge graph completion (MMKGC) aims to automatically discover new knowledge triples in the given multi-modal knowledge graphs (MMKGs). Existing methods tend to focus on crafting elegant entity-wise multi-modal fusion strategies, yet they overlook the utilization of multi-perspective features concealed within the modalities under diverse relational contexts. We introduce a novel MMKGC framework with Mixture of Modality Knowledge experts (MoMoK) to learn adaptive multi-modal embedding under intricate relational contexts.
Score: 51.80447197290866
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Multi-modal knowledge graph completion (MMKGC) aims to automatically discover new knowledge triples in the given multi-modal knowledge graphs (MMKGs), which is achieved by collaboratively modeling the structural information concealed in massive triples and the multi-modal features of the entities. Existing methods tend to focus on crafting elegant entity-wise multi-modal fusion strategies, yet they overlook the utilization of multi-perspective features concealed within the modalities under diverse relational contexts. To address this issue, we introduce a novel MMKGC framework with Mixture of Modality Knowledge experts (MoMoK for short) to learn adaptive multi-modal embedding under intricate relational contexts. We design relation-guided modality knowledge experts to acquire relation-aware modality embeddings and integrate the predictions from multi-modalities to achieve comprehensive decisions. Additionally, we disentangle the experts by minimizing their mutual information. Experiments on four public MMKG benchmarks demonstrate the outstanding performance of MoMoK under complex scenarios.
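
Illustrative sketch (not from the paper): the abstract describes relation-guided modality knowledge experts that produce relation-aware modality embeddings, but it does not specify the gating function. The snippet below assumes a plain softmax gate computed from the relation embedding and linear experts applied to one modality feature. All names (`relation_guided_mixture`, `expert_weights`, `gate_weights`) and all dimensions are hypothetical, and the mutual-information disentanglement mentioned in the abstract is not covered.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def relation_guided_mixture(modality_feature, relation_embedding,
                            expert_weights, gate_weights):
    """Mix expert outputs with gates that depend on the relation.

    Assumption: each expert is a linear map, and the gate is a softmax
    over one logit per expert computed from the relation embedding.
    """
    expert_outputs = [linear(w, modality_feature) for w in expert_weights]
    gates = softmax(linear(gate_weights, relation_embedding))
    dim = len(expert_outputs[0])
    mixed = [sum(g * out[i] for g, out in zip(gates, expert_outputs))
             for i in range(dim)]
    return mixed, gates


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Toy setup: 3 experts, a 4-d image feature, 2-d outputs, 5-d relations.
rng = random.Random(0)
num_experts, in_dim, out_dim, rel_dim = 3, 4, 2, 5
expert_weights = [random_matrix(out_dim, in_dim, rng) for _ in range(num_experts)]
gate_weights = random_matrix(num_experts, rel_dim, rng)
image_feature = [rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
relation_a = [rng.uniform(-1.0, 1.0) for _ in range(rel_dim)]
relation_b = [rng.uniform(-1.0, 1.0) for _ in range(rel_dim)]

# The same entity feature yields different embeddings under different relations.
emb_a, gates_a = relation_guided_mixture(image_feature, relation_a,
                                         expert_weights, gate_weights)
emb_b, gates_b = relation_guided_mixture(image_feature, relation_b,
                                         expert_weights, gate_weights)
print("gates under relation A:", gates_a)
print("gates under relation B:", gates_b)
```

In this toy setup the entity feature is fixed and only the relation changes, so any difference between `emb_a` and `emb_b` comes from the gate weights alone. That is one simple reading of "relation-aware modality embeddings"; the paper may implement it differently.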

Related papers

Complementarity-driven Representation Learning for Multi-modal Knowledge Graph Completion [0.0]
We propose a novel framework named Mixture of Complementary Modality Experts (MoCME). MoCME consists of a Complementarity-guided Modality Knowledge Fusion (CMKF) module and an Entropy-guided Negative Sampling (EGNS) mechanism. Our MoCME achieves state-of-the-art performance, surpassing existing approaches.
arXiv Detail & Related papers (2025-07-28T08:35:11Z)
MIND: Modality-Informed Knowledge Distillation Framework for Multimodal Clinical Prediction Tasks [50.98856172702256]
We propose the Modality-INformed knowledge Distillation (MIND) framework, a multimodal model compression approach. MIND transfers knowledge from ensembles of pre-trained deep neural networks of varying sizes into a smaller multimodal student. We evaluate MIND on binary and multilabel clinical prediction tasks using time series data and chest X-ray images.
arXiv Detail & Related papers (2025-02-03T08:50:00Z)
PIN: A Knowledge-Intensive Dataset for Paired and Interleaved Multimodal Documents [58.35492519636351]
PIN format is built on three foundational principles: knowledge intensity, scalability, and support for diverse training modalities. We present PIN-14M, an open-source dataset comprising 14 million samples derived from a diverse range of Chinese and English sources.
arXiv Detail & Related papers (2024-06-20T01:43:08Z)
Multimodal Reasoning with Multimodal Knowledge Graph [19.899398342533722]
Multimodal reasoning with large language models (LLMs) often suffers from hallucinations and the presence of deficient or outdated knowledge. We propose the Multimodal Reasoning with Multimodal Knowledge Graph (MR-MKG) method to learn rich and semantic knowledge across modalities.
arXiv Detail & Related papers (2024-06-04T07:13:23Z)
Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts [54.529880848937104]
We develop a unified MLLM with the MoE architecture, named Uni-MoE, that can handle a wide array of modalities. Specifically, it features modality-specific encoders with connectors for a unified multimodal representation. We evaluate the instruction-tuned Uni-MoE on a comprehensive set of multimodal datasets.
arXiv Detail & Related papers (2024-05-18T12:16:01Z)
MyGO: Discrete Modality Information as Fine-Grained Tokens for Multi-modal Knowledge Graph Completion [51.80447197290866]
We introduce MyGO to process, fuse, and augment the fine-grained modality information from MMKGs. MyGO tokenizes multi-modal raw data as fine-grained discrete tokens and learns entity representations with a cross-modal entity encoder. Experiments on standard MMKGC benchmarks reveal that our method surpasses 20 of the latest models.
arXiv Detail & Related papers (2024-04-15T05:40:41Z)
Zero-Shot Relational Learning for Multimodal Knowledge Graphs [31.215889061734295]
One of the major challenges is inference on newly discovered relations without associated training data. Existing works fail to leverage multimodal information and leave the problem unexplored. We propose a novel end-to-end framework, consisting of three components, i.e., multimodal learner, structure consolidator, and embedding generator, to integrate diverse multimodal information and knowledge graph structures.
arXiv Detail & Related papers (2024-04-09T11:14:45Z)
NativE: Multi-modal Knowledge Graph Completion in the Wild [51.80447197290866]
We propose a comprehensive framework NativE to achieve MMKGC in the wild. NativE proposes a relation-guided dual adaptive fusion module that enables adaptive fusion for any modalities. We construct a new benchmark called WildKGC with five datasets to evaluate our method.
arXiv Detail & Related papers (2024-03-28T03:04:00Z)
Noise-powered Multi-modal Knowledge Graph Representation Framework [52.95468915728721]
The rise of Multi-modal Pre-training highlights the necessity for a unified Multi-Modal Knowledge Graph representation learning framework. We propose a novel SNAG method that utilizes a Transformer-based architecture equipped with modality-level noise masking. Our approach achieves SOTA performance across a total of ten datasets, demonstrating its versatility.
arXiv Detail & Related papers (2024-03-11T15:48:43Z)
Multi-modal Contrastive Representation Learning for Entity Alignment [57.92705405276161]
Multi-modal entity alignment aims to identify equivalent entities between two different multi-modal knowledge graphs. We propose MCLEA, a Multi-modal Contrastive Learning based Entity Alignment model. In particular, MCLEA first learns multiple individual representations from multiple modalities, and then performs contrastive learning to jointly model intra-modal and inter-modal interactions.
arXiv Detail & Related papers (2022-09-02T08:59:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.