HERGC: Heterogeneous Experts Representation and Generative Completion for Multimodal Knowledge Graphs
- URL: http://arxiv.org/abs/2506.00826v1
- Date: Sun, 01 Jun 2025 04:12:25 GMT
- Title: HERGC: Heterogeneous Experts Representation and Generative Completion for Multimodal Knowledge Graphs
- Authors: Yongkang Xiao, Rui Zhang
- Abstract summary: Multi-modal knowledge graphs (MMKGs) enrich traditional knowledge graphs (KGs) by incorporating diverse modalities such as images and text. MMKGC seeks to exploit these heterogeneous signals to infer missing facts, thereby mitigating the intrinsic incompleteness of MMKGs. Recent generative completion approaches powered by advanced large language models (LLMs) have shown strong reasoning abilities in unimodal knowledge graph completion. We propose HERGC, a Heterogeneous Experts Representation and Generative Completion framework for MMKGs.
- Score: 6.615362280237532
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimodal knowledge graphs (MMKGs) enrich traditional knowledge graphs (KGs) by incorporating diverse modalities such as images and text. Multi-modal knowledge graph completion (MMKGC) seeks to exploit these heterogeneous signals to infer missing facts, thereby mitigating the intrinsic incompleteness of MMKGs. Existing MMKGC methods typically leverage only the information contained in the MMKGs under the closed-world assumption and adopt discriminative training objectives, which limits their reasoning capacity during completion. Recent generative completion approaches powered by advanced large language models (LLMs) have shown strong reasoning abilities in unimodal knowledge graph completion, but their potential in MMKGC remains largely unexplored. To bridge this gap, we propose HERGC, a Heterogeneous Experts Representation and Generative Completion framework for MMKGs. HERGC first deploys a Heterogeneous Experts Representation Retriever that enriches and fuses multimodal information and retrieves a compact candidate set for each incomplete triple. It then uses a Generative LLM Predictor fine-tuned on minimal instruction data to accurately identify the correct answer from these candidates. Extensive experiments on three standard MMKG benchmarks demonstrate HERGC's effectiveness and robustness, achieving state-of-the-art performance.
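The two-stage flow described in the abstract (retrieve a compact candidate set, then let a generative predictor choose among the candidates) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the entity names, the 2-dimensional embeddings, the fixed fusion weights, and the helper names `fuse`, `retrieve_candidates`, and `build_prompt` are hypothetical and do not come from the paper. In particular, the weighted average is only a stand-in for the heterogeneous-experts fusion, and the LLM call itself is omitted; the sketch stops at building the instruction prompt.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def fuse(modal_embs, weights):
    """Weighted average of per-modality embeddings.

    Assumption: a fixed-weight average stands in for the paper's
    expert-based fusion, which is not specified in the abstract.
    """
    dim = len(next(iter(modal_embs.values())))
    total = sum(weights[m] for m in modal_embs)
    return [
        sum(weights[m] * modal_embs[m][i] for m in modal_embs) / total
        for i in range(dim)
    ]


def retrieve_candidates(query_vec, entity_vecs, k):
    """Return the k entity names most similar to the query vector."""
    ranked = sorted(
        entity_vecs,
        key=lambda name: cosine(query_vec, entity_vecs[name]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(head, relation, candidates):
    """Format an instruction asking a generative model to pick one candidate."""
    options = "\n".join(f"{i + 1}. {c}" for i, c in enumerate(candidates))
    return (
        f"Complete the triple ({head}, {relation}, ?).\n"
        f"Choose one candidate:\n{options}\nAnswer:"
    )


# Hypothetical toy data: three entities, each with three modality embeddings.
entities = {
    "Paris": {"structure": [1.0, 0.0], "text": [0.9, 0.1], "image": [0.8, 0.2]},
    "Berlin": {"structure": [0.6, 0.4], "text": [0.5, 0.5], "image": [0.7, 0.3]},
    "Banana": {"structure": [0.0, 1.0], "text": [0.1, 0.9], "image": [0.2, 0.8]},
}
weights = {"structure": 0.5, "text": 0.3, "image": 0.2}

# Stage 1: fuse modalities and retrieve a compact candidate set.
fused = {name: fuse(embs, weights) for name, embs in entities.items()}
query_vec = [1.0, 0.0]  # stands in for an encoding of (France, capital_of, ?)
candidates = retrieve_candidates(query_vec, fused, k=2)

# Stage 2: hand the candidates to a generative predictor (only the prompt is built here).
prompt = build_prompt("France", "capital_of", candidates)
print(prompt)
```

Narrowing the answer space before generation is the point of the design: the predictor only has to discriminate among a handful of plausible entities rather than generate an entity name over the whole graph.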
Related papers
- Hyper-modal Imputation Diffusion Embedding with Dual-Distillation for Federated Multimodal Knowledge Graph Completion [59.54067771781552]
We propose a framework named MMFeD3-HidE for addressing multimodal uncertain unavailability and multimodal client heterogeneity challenges of FedMKGC. We propose a FedMKGC benchmark for a comprehensive evaluation, consisting of a general FedMKGC backbone named MMFedE, datasets with heterogeneous multimodal information, and three groups of constructed baselines.
arXiv Detail & Related papers (2025-06-27T09:32:58Z)
- MARIOH: Multiplicity-Aware Hypergraph Reconstruction [26.07529457537888]
We propose MARIOH, a supervised approach for reconstructing the original hypergraph from its projected graph by leveraging edge multiplicity. In our experiments using 10 real-world datasets, MARIOH achieves up to 74.51% higher reconstruction accuracy compared to state-of-the-art methods.
arXiv Detail & Related papers (2025-04-01T08:14:59Z)
- GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation [84.41557981816077]
We introduce GFM-RAG, a novel graph foundation model (GFM) for retrieval augmented generation. GFM-RAG is powered by an innovative graph neural network that reasons over graph structure to capture complex query-knowledge relationships. It achieves state-of-the-art performance while maintaining efficiency and alignment with neural scaling laws.
arXiv Detail & Related papers (2025-02-03T07:04:29Z)
- Multiple Heads are Better than One: Mixture of Modality Knowledge Experts for Entity Representation Learning [51.80447197290866]
Learning high-quality multi-modal entity representations is an important goal of multi-modal knowledge graph (MMKG) representation learning. Existing methods focus on crafting elegant entity-wise multi-modal fusion strategies. We introduce a novel framework with Mixture of Modality Knowledge experts (MoMoK) to learn adaptive multi-modal entity representations.
arXiv Detail & Related papers (2024-05-27T06:36:17Z)
- Generate-on-Graph: Treat LLM as both Agent and KG in Incomplete Knowledge Graph Question Answering [87.67177556994525]
We propose a training-free method called Generate-on-Graph (GoG) to generate new factual triples while exploring Knowledge Graphs (KGs).
GoG performs reasoning through a Thinking-Searching-Generating framework, which treats the LLM as both Agent and KG in IKGQA.
arXiv Detail & Related papers (2024-04-23T04:47:22Z)
- Tokenization, Fusion, and Augmentation: Towards Fine-grained Multi-modal Entity Representation [51.80447197290866]
Multi-modal knowledge graph completion (MMKGC) aims to discover unobserved knowledge from given knowledge graphs. Existing MMKGC methods usually extract multi-modal features with pre-trained models. We introduce a novel framework MyGO to tokenize, fuse, and augment the fine-grained multi-modal representations of entities.
arXiv Detail & Related papers (2024-04-15T05:40:41Z)
- Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey [61.8716670402084]
This survey focuses on KG-aware research in two principal aspects: KG-driven Multi-Modal (KG4MM) learning, and Multi-Modal Knowledge Graph (MM4KG).
Our review includes two primary task categories: KG-aware multi-modal learning tasks, and intrinsic MMKG tasks.
For most of these tasks, we provide definitions and evaluation benchmarks, and additionally outline essential insights for conducting relevant research.
arXiv Detail & Related papers (2024-02-08T04:04:36Z)
- MACO: A Modality Adversarial and Contrastive Framework for Modality-missing Multi-modal Knowledge Graph Completion [18.188971531961663]
We propose a modality adversarial and contrastive framework (MACO) to solve the modality-missing problem in MMKGC.
MACO trains a generator and discriminator adversarially to generate missing modality features that can be incorporated into the MMKGC model.
arXiv Detail & Related papers (2023-08-13T06:29:38Z)
- Knowledge Graph Completion with Pre-trained Multimodal Transformer and Twins Negative Sampling [13.016173217017597]
We propose a VisualBERT-enhanced Knowledge Graph Completion model (VBKGC for short).
VBKGC can capture deeply fused multimodal information for entities and integrate it into the KGC model.
We conduct extensive experiments to show the outstanding performance of VBKGC on the link prediction task.
arXiv Detail & Related papers (2022-09-15T06:50:31Z)
- Multi-Modal Knowledge Graph Construction and Application: A Survey [17.203534055251435]
Multi-modalization of knowledge graphs is an inevitable key step towards the realization of human-level machine intelligence.
We first give definitions of MMKGs constructed from texts and images, followed by the preliminaries on multi-modal tasks and techniques.
We then systematically review the challenges, progress, and opportunities in the construction and application of MMKGs respectively, with detailed analyses of the strengths and weaknesses of different solutions.
arXiv Detail & Related papers (2022-02-11T17:31:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.