Multimodal Graph Benchmark
- URL: http://arxiv.org/abs/2406.16321v1
- Date: Mon, 24 Jun 2024 05:14:09 GMT
- Title: Multimodal Graph Benchmark
- Authors: Jing Zhu, Yuhang Zhou, Shengyi Qian, Zhongmou He, Tong Zhao, Neil Shah, Danai Koutra,
- Abstract summary: Multimodal Graph Benchmark (MM-GRAPH) is first comprehensive multi-modal graph benchmark that incorporates both textual and visual information.
MM-GRAPH consists of five graph learning datasets of various scales that are appropriate for different learning tasks.
MM-GRAPH aims to foster research on multimodal graph learning and drive the development of more advanced and robust graph learning algorithms.
- Score: 36.75510196380185
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Associating unstructured data with structured information is crucial for real-world tasks that require relevance search. However, existing graph learning benchmarks often overlook the rich semantic information associate with each node. To bridge such gap, we introduce the Multimodal Graph Benchmark (MM-GRAPH), the first comprehensive multi-modal graph benchmark that incorporates both textual and visual information. MM-GRAPH surpasses previous efforts, which have primarily focused on text-attributed graphs with various connectivity patterns. MM-GRAPH consists of five graph learning datasets of various scales that are appropriate for different learning tasks. Their multimodal node features, enabling a more comprehensive evaluation of graph learning algorithms in real-world scenarios. To facilitate research on multimodal graph learning, we further provide an extensive study on the performance of various graph neural networks in the presence of features from various modalities. MM-GRAPH aims to foster research on multimodal graph learning and drive the development of more advanced and robust graph learning algorithms. By providing a diverse set of datasets and benchmarks, MM-GRAPH enables researchers to evaluate and compare their models in realistic settings, ultimately leading to improved performance on real-world applications that rely on multimodal graph data.
Related papers
- When Graph meets Multimodal: Benchmarking on Multimodal Attributed Graphs Learning [36.6581535146878]
Multimodal attributed graphs (MAGs) are prevalent in various real-world scenarios and generally contain two kinds of knowledge.
Recent advancements in Pre-trained Language/Vision models (PLMs/PVMs) and Graph neural networks (GNNs) have facilitated effective learning on MAGs.
We propose Multimodal Attribute Graph Benchmark (MAGB), a comprehensive and diverse collection of challenging benchmark datasets for MAGs.
arXiv Detail & Related papers (2024-10-11T13:24:57Z) - Exploring Graph Structure Comprehension Ability of Multimodal Large Language Models: Case Studies [7.067145619709089]
This study investigates the impact of graph visualisations on Large Language Models (LLMs) performance.
Our experiments compare the effectiveness of multimodal approaches against purely textual graph representations.
arXiv Detail & Related papers (2024-09-13T14:26:58Z) - Representation learning in multiplex graphs: Where and how to fuse
information? [5.0235828656754915]
Multiplex graphs possess richer information, provide better modeling capabilities and integrate more detailed data from potentially different sources.
In this paper, we tackle the problem of learning representations for nodes in multiplex networks in an unsupervised or self-supervised manner.
We propose improvements in how to construct GNN architectures that deal with multiplex graphs.
arXiv Detail & Related papers (2024-02-27T21:47:06Z) - Learning on Multimodal Graphs: A Survey [6.362513821299131]
Multimodal data pervades various domains, including healthcare, social media, and transportation.
multimodal graph learning (MGL) is essential for successful artificial intelligence (AI) applications.
arXiv Detail & Related papers (2024-02-07T23:50:00Z) - When Graph Data Meets Multimodal: A New Paradigm for Graph Understanding
and Reasoning [54.84870836443311]
The paper presents a new paradigm for understanding and reasoning about graph data by integrating image encoding and multimodal technologies.
This approach enables the comprehension of graph data through an instruction-response format, utilizing GPT-4V's advanced capabilities.
The study evaluates this paradigm on various graph types, highlighting the model's strengths and weaknesses, particularly in Chinese OCR performance and complex reasoning tasks.
arXiv Detail & Related papers (2023-12-16T08:14:11Z) - MMGA: Multimodal Learning with Graph Alignment [8.349066399479938]
We propose MMGA, a novel multimodal pre-training framework to incorporate information from graph (social network), image and text modalities on social media.
In MMGA, a multi-step graph alignment mechanism is proposed to add the self-supervision from graph modality to optimize the image and text encoders.
We release our dataset, the first social media multimodal dataset with graph, of 60,000 users labeled with specific topics based on 2 million posts to facilitate future research.
arXiv Detail & Related papers (2022-10-18T15:50:31Z) - Graph Pooling for Graph Neural Networks: Progress, Challenges, and
Opportunities [128.55790219377315]
Graph neural networks have emerged as a leading architecture for many graph-level tasks.
graph pooling is indispensable for obtaining a holistic graph-level representation of the whole graph.
arXiv Detail & Related papers (2022-04-15T04:02:06Z) - Group Contrastive Self-Supervised Learning on Graphs [101.45974132613293]
We study self-supervised learning on graphs using contrastive methods.
We argue that contrasting graphs in multiple subspaces enables graph encoders to capture more abundant characteristics.
arXiv Detail & Related papers (2021-07-20T22:09:21Z) - Model-Agnostic Graph Regularization for Few-Shot Learning [60.64531995451357]
We present a comprehensive study on graph embedded few-shot learning.
We introduce a graph regularization approach that allows a deeper understanding of the impact of incorporating graph information between labels.
Our approach improves the performance of strong base learners by up to 2% on Mini-ImageNet and 6.7% on ImageNet-FS.
arXiv Detail & Related papers (2021-02-14T05:28:13Z) - Multi-view Graph Learning by Joint Modeling of Consistency and
Inconsistency [65.76554214664101]
Graph learning has emerged as a promising technique for multi-view clustering with its ability to learn a unified and robust graph from multiple views.
We propose a new multi-view graph learning framework, which for the first time simultaneously models multi-view consistency and multi-view inconsistency in a unified objective function.
Experiments on twelve multi-view datasets have demonstrated the robustness and efficiency of the proposed approach.
arXiv Detail & Related papers (2020-08-24T06:11:29Z) - GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training [62.73470368851127]
Graph representation learning has emerged as a powerful technique for addressing real-world problems.
We design Graph Contrastive Coding -- a self-supervised graph neural network pre-training framework.
We conduct experiments on three graph learning tasks and ten graph datasets.
arXiv Detail & Related papers (2020-06-17T16:18:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.