Related papers: HERGC: Heterogeneous Experts Representation and Generative Completion for Multimodal Knowledge Graphs

URL: http://arxiv.org/abs/2506.00826v2
Date: Fri, 08 Aug 2025 18:42:44 GMT
Title: HERGC: Heterogeneous Experts Representation and Generative Completion for Multimodal Knowledge Graphs
Authors: Yongkang Xiao, Rui Zhang
Abstract summary: Multimodal knowledge graphs (MMKGs) enrich traditional knowledge graphs (KGs) by incorporating diverse modalities such as images and text. Multimodal knowledge graph completion (MMKGC) seeks to exploit these heterogeneous signals to infer missing facts. HERGC is a flexible Heterogeneous Experts Representation and Generative Completion framework for MMKGs.
Score: 6.615362280237532
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Multimodal knowledge graphs (MMKGs) enrich traditional knowledge graphs (KGs) by incorporating diverse modalities such as images and text. Multimodal knowledge graph completion (MMKGC) seeks to exploit these heterogeneous signals to infer missing facts, thereby mitigating the intrinsic incompleteness of MMKGs. Existing MMKGC methods typically leverage only the information contained in the MMKGs under the closed-world assumption and adopt discriminative training objectives, which limits their reasoning capacity during completion. Recent large language models (LLMs), empowered by massive parameter scales and pretraining on vast corpora, have demonstrated strong reasoning abilities across various tasks. However, their potential in MMKGC remains largely unexplored. To bridge this gap, we propose HERGC, a flexible Heterogeneous Experts Representation and Generative Completion framework for MMKGs. HERGC first deploys a Heterogeneous Experts Representation Retriever that enriches and fuses multimodal information and retrieves a compact candidate set for each incomplete triple. It then uses a Generative LLM Predictor, implemented via either in-context learning or lightweight fine-tuning, to accurately identify the correct answer from these candidates. Extensive experiments on three standard MMKG benchmarks demonstrate HERGC's effectiveness and robustness, achieving superior performance over existing methods.
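
The two-stage flow described in the abstract (retrieve a compact candidate set, then let a generative model pick the answer) can be illustrated with a toy sketch. This is not the authors' implementation: the averaging fusion, the dot-product scoring, the prompt wording, and every name below (`fuse`, `retrieve_candidates`, `build_prompt`, `complete`, the `llm` callable, the sample entities) are assumptions made only for illustration.

```python
from typing import Callable, Dict, List

Vector = List[float]


def fuse(expert_views: List[Vector]) -> Vector:
    """Fuse per-modality vectors by plain averaging (a stand-in for learned fusion)."""
    n = len(expert_views)
    return [sum(vals) / n for vals in zip(*expert_views)]


def retrieve_candidates(query: Vector, entities: Dict[str, Vector], k: int) -> List[str]:
    """Return the k entity names with the highest dot product against the query."""
    def score(name: str) -> float:
        return sum(q * e for q, e in zip(query, entities[name]))
    return sorted(entities, key=score, reverse=True)[:k]


def build_prompt(head: str, relation: str, candidates: List[str]) -> str:
    """Format the incomplete triple and its candidates as a multiple-choice prompt."""
    options = "\n".join(f"{i + 1}. {c}" for i, c in enumerate(candidates))
    return (
        f"Complete the triple ({head}, {relation}, ?).\n"
        f"Candidates:\n{options}\n"
        "Answer with one candidate name."
    )


def complete(head: str, relation: str, query: Vector,
             entities: Dict[str, Vector],
             llm: Callable[[str], str], k: int = 3) -> str:
    """Retrieve k candidates, ask the (hypothetical) LLM, fall back to the top candidate."""
    candidates = retrieve_candidates(query, entities, k)
    answer = llm(build_prompt(head, relation, candidates)).strip()
    return answer if answer in candidates else candidates[0]


# Hypothetical two-dimensional "structure" and "text" views for three entities.
entities = {
    "Paris": fuse([[1.0, 0.0], [0.8, 0.2]]),
    "Berlin": fuse([[0.6, 0.4], [0.4, 0.6]]),
    "Banana": fuse([[0.0, 1.0], [0.2, 0.8]]),
}

# A stub stands in for the generative predictor; a real model call would go here.
prediction = complete("France", "capital", [1.0, 0.0], entities,
                      llm=lambda prompt: "Paris", k=2)
print(prediction)
```

Restricting the predictor to the retrieved candidates keeps the prompt short and lets the caller reject any answer that falls outside the candidate set, which is why `complete` falls back to the top-ranked candidate in that case.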

Related papers

Every Little Helps: Building Knowledge Graph Foundation Model with Fine-grained Transferable Multi-modal Tokens [60.15844119489298]
Multi-modal knowledge graph reasoning (MMKGR) aims to predict the missing links by exploiting both graph structure information and multi-modal entity contents. We propose a token-based foundation model (TOFU) for MMKGR, which exhibits strong generalization across different MMKGs. Experimental results on 17 transductive, inductive, and fully-inductive MMKGs show that TOFU consistently outperforms strong KGFM and MMKGR baselines.
arXiv Detail & Related papers (2026-02-11T13:32:09Z)
M$^3$KG-RAG: Multi-hop Multimodal Knowledge Graph-enhanced Retrieval-Augmented Generation [20.170643730917963]
M$^3$KG-RAG is a Multi-hop Multimodal Knowledge Graph-enhanced RAG. It retrieves query-aligned audio-visual knowledge from MMKGs. It improves reasoning depth and answer faithfulness in MLLMs.
arXiv Detail & Related papers (2025-12-23T07:54:03Z)
Hyper-modal Imputation Diffusion Embedding with Dual-Distillation for Federated Multimodal Knowledge Graph Completion [59.54067771781552]
We propose a framework named MMFeD3-HidE for addressing multimodal uncertain unavailability and multimodal client heterogeneity challenges of FedMKGC. We propose a FedMKGC benchmark for a comprehensive evaluation, consisting of a general FedMKGC backbone named MMFedE, datasets with heterogeneous multimodal information, and three groups of constructed baselines.
arXiv Detail & Related papers (2025-06-27T09:32:58Z)
MARIOH: Multiplicity-Aware Hypergraph Reconstruction [26.07529457537888]
We propose MARIOH, a supervised approach for reconstructing the original hypergraph from its projected graph by leveraging edge multiplicity. In our experiments using 10 real-world datasets, MARIOH achieves up to 74.51% higher reconstruction accuracy compared to state-of-the-art methods.
arXiv Detail & Related papers (2025-04-01T08:14:59Z)
Symbolic Mixture-of-Experts: Adaptive Skill-based Routing for Heterogeneous Reasoning [76.10639521319382]
We propose Symbolic-MoE, a symbolic, text-based, and gradient-free Mixture-of-Experts framework. We show Symbolic-MoE beats strong LLMs like GPT4o-mini, as well as multi-agent approaches, with an absolute avg. gain of 8.15% over the best multi-agent baseline.
arXiv Detail & Related papers (2025-03-07T18:03:13Z)
Convergence Rates for Softmax Gating Mixture of Experts [78.3687645289918]
Mixture of experts (MoE) has emerged as an effective framework to advance the efficiency and scalability of machine learning models. Central to the success of MoE is an adaptive softmax gating mechanism which takes responsibility for determining the relevance of each expert to a given input and then dynamically assigning experts their respective weights. We perform a convergence analysis of parameter estimation and expert estimation under the MoE equipped with the standard softmax gating or its variants, including a dense-to-sparse gating and a hierarchical softmax gating.
arXiv Detail & Related papers (2025-03-05T06:11:24Z)
GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation [84.41557981816077]
We introduce GFM-RAG, a novel graph foundation model (GFM) for retrieval augmented generation. GFM-RAG is powered by an innovative graph neural network that reasons over graph structure to capture complex query-knowledge relationships. It achieves state-of-the-art performance while maintaining efficiency and alignment with neural scaling laws.
arXiv Detail & Related papers (2025-02-03T07:04:29Z)
Multimodal Reasoning with Multimodal Knowledge Graph [19.899398342533722]
Multimodal reasoning with large language models (LLMs) often suffers from hallucinations and the presence of deficient or outdated knowledge. We propose the Multimodal Reasoning with Multimodal Knowledge Graph (MR-MKG) method to learn rich and semantic knowledge across modalities.
arXiv Detail & Related papers (2024-06-04T07:13:23Z)
Multiple Heads are Better than One: Mixture of Modality Knowledge Experts for Entity Representation Learning [51.80447197290866]
Learning high-quality multi-modal entity representations is an important goal of multi-modal knowledge graph (MMKG) representation learning. Existing methods focus on crafting elegant entity-wise multi-modal fusion strategies. We introduce a novel framework with Mixture of Modality Knowledge experts (MoMoK) to learn adaptive multi-modal entity representations.
arXiv Detail & Related papers (2024-05-27T06:36:17Z)
Generate-on-Graph: Treat LLM as both Agent and KG in Incomplete Knowledge Graph Question Answering [87.67177556994525]
We propose a training-free method called Generate-on-Graph (GoG) to generate new factual triples while exploring Knowledge Graphs (KGs). GoG performs reasoning through a Thinking-Searching-Generating framework, which treats LLM as both Agent and KG in IKGQA.
arXiv Detail & Related papers (2024-04-23T04:47:22Z)
Tokenization, Fusion, and Augmentation: Towards Fine-grained Multi-modal Entity Representation [51.80447197290866]
Multi-modal knowledge graph completion (MMKGC) aims to discover unobserved knowledge from given knowledge graphs. Existing MMKGC methods usually extract multi-modal features with pre-trained models. We introduce a novel framework MyGO to tokenize, fuse, and augment the fine-grained multi-modal representations of entities.
arXiv Detail & Related papers (2024-04-15T05:40:41Z)
Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey [61.8716670402084]
This survey focuses on KG-aware research in two principal aspects: KG-driven Multi-Modal (KG4MM) learning, and Multi-Modal Knowledge Graph (MM4KG). Our review includes two primary task categories: KG-aware multi-modal learning tasks, and intrinsic MMKG tasks. For most of these tasks, we provide definitions, evaluation benchmarks, and additionally outline essential insights for conducting relevant research.
arXiv Detail & Related papers (2024-02-08T04:04:36Z)
MACO: A Modality Adversarial and Contrastive Framework for Modality-missing Multi-modal Knowledge Graph Completion [18.188971531961663]
We propose a modality adversarial and contrastive framework (MACO) to solve the modality-missing problem in MMKGC. MACO trains a generator and discriminator adversarially to generate missing modality features that can be incorporated into the MMKGC model.
arXiv Detail & Related papers (2023-08-13T06:29:38Z)
Knowledge Graph Completion with Pre-trained Multimodal Transformer and Twins Negative Sampling [13.016173217017597]
We propose a VisualBERT-enhanced Knowledge Graph Completion model (VBKGC for short). VBKGC can capture deeply fused multimodal information for entities and integrate it into the KGC model. We conduct extensive experiments to show the outstanding performance of VBKGC on the link prediction task.
arXiv Detail & Related papers (2022-09-15T06:50:31Z)
Multi-Modal Knowledge Graph Construction and Application: A Survey [17.203534055251435]
Multi-modalization of knowledge graphs is an inevitable key step towards the realization of human-level machine intelligence. We first give definitions of MMKGs constructed from texts and images, followed by the preliminaries on multi-modal tasks and techniques. We then systematically review the challenges, progress, and opportunities in the construction and application of MMKGs respectively, with detailed analyses of the strengths and weaknesses of different solutions.
arXiv Detail & Related papers (2022-02-11T17:31:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.