Related papers: Multimodal Graph Benchmark

Multimodal Graph Benchmark

URL: http://arxiv.org/abs/2406.16321v1
Date: Mon, 24 Jun 2024 05:14:09 GMT
Title: Multimodal Graph Benchmark
Authors: Jing Zhu, Yuhang Zhou, Shengyi Qian, Zhongmou He, Tong Zhao, Neil Shah, Danai Koutra,
Abstract summary: Multimodal Graph Benchmark (MM-GRAPH) is first comprehensive multi-modal graph benchmark that incorporates both textual and visual information. MM-GRAPH consists of five graph learning datasets of various scales that are appropriate for different learning tasks. MM-GRAPH aims to foster research on multimodal graph learning and drive the development of more advanced and robust graph learning algorithms.
Score: 36.75510196380185
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Associating unstructured data with structured information is crucial for real-world tasks that require relevance search. However, existing graph learning benchmarks often overlook the rich semantic information associate with each node. To bridge such gap, we introduce the Multimodal Graph Benchmark (MM-GRAPH), the first comprehensive multi-modal graph benchmark that incorporates both textual and visual information. MM-GRAPH surpasses previous efforts, which have primarily focused on text-attributed graphs with various connectivity patterns. MM-GRAPH consists of five graph learning datasets of various scales that are appropriate for different learning tasks. Their multimodal node features, enabling a more comprehensive evaluation of graph learning algorithms in real-world scenarios. To facilitate research on multimodal graph learning, we further provide an extensive study on the performance of various graph neural networks in the presence of features from various modalities. MM-GRAPH aims to foster research on multimodal graph learning and drive the development of more advanced and robust graph learning algorithms. By providing a diverse set of datasets and benchmarks, MM-GRAPH enables researchers to evaluate and compare their models in realistic settings, ultimately leading to improved performance on real-world applications that rely on multimodal graph data.

Related papers

MLaGA: Multimodal Large Language and Graph Assistant [9.985787670804823]
Large Language Models (LLMs) have demonstrated substantial efficacy in advancing graph-structured data analysis.<n>We introduce the Multimodal Large Language and Graph Assistant (MLaGA), an innovative model that adeptly extends LLM capabilities to facilitate reasoning over complex graph structures and multimodal attributes.
arXiv Detail & Related papers (2025-06-03T07:52:00Z)
Graph-to-Vision: Multi-graph Understanding and Reasoning using Vision-Language Models [10.813015912529936]
We introduce the first comprehensive benchmark designed to evaluate and enhance the multi-graph reasoning abilities of Vision-Language Models (VLMs)<n>Our benchmark covers four common graph types-knowledge graphs, flowcharts, mind maps, and route maps-and supports both homogeneous and heterogeneous graph groupings.<n>We evaluate several state-of-the-art VLMs under a multi-dimensional scoring framework that assesses graph parsing, reasoning consistency, and instruction-following accuracy.
arXiv Detail & Related papers (2025-03-27T12:20:37Z)
UniGraph2: Learning a Unified Embedding Space to Bind Multimodal Graphs [34.48393396390799]
We propose a novel cross-domain graph foundation model that enables general representation learning on multimodal graphs. UniGraph2 employs modality-specific encoders alongside a graph neural network (GNN) to learn a unified low-dimensional embedding space. We show that UniGraph2 significantly outperforms state-of-the-art models in tasks such as representation learning, transfer learning, and multimodal generative tasks.
arXiv Detail & Related papers (2025-02-02T14:04:53Z)
When Graph meets Multimodal: Benchmarking on Multimodal Attributed Graphs Learning [36.6581535146878]
Multimodal attributed graphs (MAGs) are prevalent in various real-world scenarios and generally contain two kinds of knowledge. Recent advancements in Pre-trained Language/Vision models (PLMs/PVMs) and Graph neural networks (GNNs) have facilitated effective learning on MAGs. We propose Multimodal Attribute Graph Benchmark (MAGB), a comprehensive and diverse collection of challenging benchmark datasets for MAGs.
arXiv Detail & Related papers (2024-10-11T13:24:57Z)
Exploring Graph Structure Comprehension Ability of Multimodal Large Language Models: Case Studies [7.067145619709089]
This study investigates the impact of graph visualisations on Large Language Models (LLMs) performance. Our experiments compare the effectiveness of multimodal approaches against purely textual graph representations.
arXiv Detail & Related papers (2024-09-13T14:26:58Z)
Bridging Local Details and Global Context in Text-Attributed Graphs [62.522550655068336]
GraphBridge is a framework that bridges local and global perspectives by leveraging contextual textual information. Our method achieves state-of-theart performance, while our graph-aware token reduction module significantly enhances efficiency and solves scalability issues.
arXiv Detail & Related papers (2024-06-18T13:35:25Z)
Representation learning in multiplex graphs: Where and how to fuse information? [5.0235828656754915]
Multiplex graphs possess richer information, provide better modeling capabilities and integrate more detailed data from potentially different sources. In this paper, we tackle the problem of learning representations for nodes in multiplex networks in an unsupervised or self-supervised manner. We propose improvements in how to construct GNN architectures that deal with multiplex graphs.
arXiv Detail & Related papers (2024-02-27T21:47:06Z)
Learning on Multimodal Graphs: A Survey [6.362513821299131]
Multimodal data pervades various domains, including healthcare, social media, and transportation. multimodal graph learning (MGL) is essential for successful artificial intelligence (AI) applications.
arXiv Detail & Related papers (2024-02-07T23:50:00Z)
When Graph Data Meets Multimodal: A New Paradigm for Graph Understanding and Reasoning [54.84870836443311]
The paper presents a new paradigm for understanding and reasoning about graph data by integrating image encoding and multimodal technologies. This approach enables the comprehension of graph data through an instruction-response format, utilizing GPT-4V's advanced capabilities. The study evaluates this paradigm on various graph types, highlighting the model's strengths and weaknesses, particularly in Chinese OCR performance and complex reasoning tasks.
arXiv Detail & Related papers (2023-12-16T08:14:11Z)
GPT4Graph: Can Large Language Models Understand Graph Structured Data ? An Empirical Evaluation and Benchmarking [17.7473474499538]
Large language models like ChatGPT have become indispensable to artificial general intelligence. In this study, we conduct an investigation to assess the proficiency of LLMs in comprehending graph data. Our findings contribute valuable insights towards bridging the gap between language models and graph understanding.
arXiv Detail & Related papers (2023-05-24T11:53:19Z)
Cross-view Graph Contrastive Representation Learning on Partially Aligned Multi-view Data [52.491074276133325]
Multi-view representation learning has developed rapidly over the past decades and has been applied in many fields. We propose a new cross-view graph contrastive learning framework, which integrates multi-view information to align data and learn latent representations. Experiments conducted on several real datasets demonstrate the effectiveness of the proposed method on the clustering and classification tasks.
arXiv Detail & Related papers (2022-11-08T09:19:32Z)
MMGA: Multimodal Learning with Graph Alignment [8.349066399479938]
We propose MMGA, a novel multimodal pre-training framework to incorporate information from graph (social network), image and text modalities on social media. In MMGA, a multi-step graph alignment mechanism is proposed to add the self-supervision from graph modality to optimize the image and text encoders. We release our dataset, the first social media multimodal dataset with graph, of 60,000 users labeled with specific topics based on 2 million posts to facilitate future research.
arXiv Detail & Related papers (2022-10-18T15:50:31Z)
Graph Pooling for Graph Neural Networks: Progress, Challenges, and Opportunities [128.55790219377315]
Graph neural networks have emerged as a leading architecture for many graph-level tasks. graph pooling is indispensable for obtaining a holistic graph-level representation of the whole graph.
arXiv Detail & Related papers (2022-04-15T04:02:06Z)
Group Contrastive Self-Supervised Learning on Graphs [101.45974132613293]
We study self-supervised learning on graphs using contrastive methods. We argue that contrasting graphs in multiple subspaces enables graph encoders to capture more abundant characteristics.
arXiv Detail & Related papers (2021-07-20T22:09:21Z)
Model-Agnostic Graph Regularization for Few-Shot Learning [60.64531995451357]
We present a comprehensive study on graph embedded few-shot learning. We introduce a graph regularization approach that allows a deeper understanding of the impact of incorporating graph information between labels. Our approach improves the performance of strong base learners by up to 2% on Mini-ImageNet and 6.7% on ImageNet-FS.
arXiv Detail & Related papers (2021-02-14T05:28:13Z)
Multi-view Graph Learning by Joint Modeling of Consistency and Inconsistency [65.76554214664101]
Graph learning has emerged as a promising technique for multi-view clustering with its ability to learn a unified and robust graph from multiple views. We propose a new multi-view graph learning framework, which for the first time simultaneously models multi-view consistency and multi-view inconsistency in a unified objective function. Experiments on twelve multi-view datasets have demonstrated the robustness and efficiency of the proposed approach.
arXiv Detail & Related papers (2020-08-24T06:11:29Z)
GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training [62.73470368851127]
Graph representation learning has emerged as a powerful technique for addressing real-world problems. We design Graph Contrastive Coding -- a self-supervised graph neural network pre-training framework. We conduct experiments on three graph learning tasks and ten graph datasets.
arXiv Detail & Related papers (2020-06-17T16:18:35Z)

This list is automatically generated from the titles and abstracts of the papers in this site.