UniGraph2: Learning a Unified Embedding Space to Bind Multimodal Graphs
- URL: http://arxiv.org/abs/2502.00806v1
- Date: Sun, 02 Feb 2025 14:04:53 GMT
- Title: UniGraph2: Learning a Unified Embedding Space to Bind Multimodal Graphs
- Authors: Yufei He, Yuan Sui, Xiaoxin He, Yue Liu, Yifei Sun, Bryan Hooi,
- Abstract summary: We propose a novel cross-domain graph foundation model that enables general representation learning on multimodal graphs.
UniGraph2 employs modality-specific encoders alongside a graph neural network (GNN) to learn a unified low-dimensional embedding space.
We show that UniGraph2 significantly outperforms state-of-the-art models in tasks such as representation learning, transfer learning, and multimodal generative tasks.
- Score: 34.48393396390799
- License:
- Abstract: Existing foundation models, such as CLIP, aim to learn a unified embedding space for multimodal data, enabling a wide range of downstream web-based applications like search, recommendation, and content classification. However, these models often overlook the inherent graph structures in multimodal datasets, where entities and their relationships are crucial. Multimodal graphs (MMGs) represent such graphs where each node is associated with features from different modalities, while the edges capture the relationships between these entities. On the other hand, existing graph foundation models primarily focus on text-attributed graphs (TAGs) and are not designed to handle the complexities of MMGs. To address these limitations, we propose UniGraph2, a novel cross-domain graph foundation model that enables general representation learning on MMGs, providing a unified embedding space. UniGraph2 employs modality-specific encoders alongside a graph neural network (GNN) to learn a unified low-dimensional embedding space that captures both the multimodal information and the underlying graph structure. We propose a new cross-domain multi-graph pre-training algorithm at scale to ensure effective transfer learning across diverse graph domains and modalities. Additionally, we adopt a Mixture of Experts (MoE) component to align features from different domains and modalities, ensuring coherent and robust embeddings that unify the information across modalities. Extensive experiments on a variety of multimodal graph tasks demonstrate that UniGraph2 significantly outperforms state-of-the-art models in tasks such as representation learning, transfer learning, and multimodal generative tasks, offering a scalable and flexible solution for learning on MMGs.
Related papers
- Dual-level Mixup for Graph Few-shot Learning with Fewer Tasks [23.07584018576066]
We propose a SiMple yet effectIve approach for graph few-shot Learning with fEwer tasks, named SMILE.
We introduce a dual-level mixup strategy, encompassing both within-task and across-task mixup, to simultaneously enrich the available nodes and tasks in meta-learning.
Empirically, SMILE consistently outperforms other competitive models by a large margin across all evaluated datasets with in-domain and cross-domain settings.
arXiv Detail & Related papers (2025-02-19T23:59:05Z) - Multi-view Fuzzy Graph Attention Networks for Enhanced Graph Learning [0.0]
Fuzzy Graph Attention Network (FGAT) has shown promise in tasks requiring robust graph-based learning.
We propose the Multi-view Fuzzy Graph Attention Network (MFGAT), a novel framework that constructs and aggregates multi-view information.
arXiv Detail & Related papers (2024-12-23T04:39:08Z) - One Model for One Graph: A New Perspective for Pretraining with Cross-domain Graphs [61.9759512646523]
Graph Neural Networks (GNNs) have emerged as a powerful tool to capture intricate network patterns.
Existing GNNs require careful domain-specific architecture designs and training from scratch on each dataset.
We propose a novel cross-domain pretraining framework, "one model for one graph"
arXiv Detail & Related papers (2024-11-30T01:49:45Z) - GraphFM: A Scalable Framework for Multi-Graph Pretraining [2.882104808886318]
We introduce a scalable multi-graph multi-task pretraining approach specifically tailored for node classification tasks across diverse graph datasets from different domains.
We demonstrate the efficacy of our approach by training a model on 152 different graph datasets comprising over 7.4 million nodes and 189 million edges.
Our results show that pretraining on a diverse array of real and synthetic graphs improves the model's adaptability and stability, while performing competitively with state-of-the-art specialist models.
arXiv Detail & Related papers (2024-07-16T16:51:43Z) - Multimodal Graph Benchmark [36.75510196380185]
Multimodal Graph Benchmark (MM-GRAPH) is first comprehensive multi-modal graph benchmark that incorporates both textual and visual information.
MM-GRAPH consists of five graph learning datasets of various scales that are appropriate for different learning tasks.
MM-GRAPH aims to foster research on multimodal graph learning and drive the development of more advanced and robust graph learning algorithms.
arXiv Detail & Related papers (2024-06-24T05:14:09Z) - UniGraph: Learning a Unified Cross-Domain Foundation Model for Text-Attributed Graphs [30.635472655668078]
Text-Attributed Graphs (TAGs) can generalize to unseen graphs and tasks across diverse domains.
We propose a novel cascaded architecture of Language Models (LMs) and Graph Neural Networks (GNNs) as backbone networks.
We demonstrate the model's effectiveness in self-supervised representation learning on unseen graphs, few-shot in-context transfer, and zero-shot transfer.
arXiv Detail & Related papers (2024-02-21T09:06:31Z) - Model-Agnostic Graph Regularization for Few-Shot Learning [60.64531995451357]
We present a comprehensive study on graph embedded few-shot learning.
We introduce a graph regularization approach that allows a deeper understanding of the impact of incorporating graph information between labels.
Our approach improves the performance of strong base learners by up to 2% on Mini-ImageNet and 6.7% on ImageNet-FS.
arXiv Detail & Related papers (2021-02-14T05:28:13Z) - Graphonomy: Universal Image Parsing via Graph Reasoning and Transfer [140.72439827136085]
We propose a graph reasoning and transfer learning framework named "Graphonomy"
It incorporates human knowledge and label taxonomy into the intermediate graph representation learning beyond local convolutions.
It learns the global and structured semantic coherency in multiple domains via semantic-aware graph reasoning and transfer.
arXiv Detail & Related papers (2021-01-26T08:19:03Z) - Multi-Level Graph Convolutional Network with Automatic Graph Learning
for Hyperspectral Image Classification [63.56018768401328]
We propose a Multi-level Graph Convolutional Network (GCN) with Automatic Graph Learning method (MGCN-AGL) for HSI classification.
By employing attention mechanism to characterize the importance among spatially neighboring regions, the most relevant information can be adaptively incorporated to make decisions.
Our MGCN-AGL encodes the long range dependencies among image regions based on the expressive representations that have been produced at local level.
arXiv Detail & Related papers (2020-09-19T09:26:20Z) - Tensor Graph Convolutional Networks for Multi-relational and Robust
Learning [74.05478502080658]
This paper introduces a tensor-graph convolutional network (TGCN) for scalable semi-supervised learning (SSL) from data associated with a collection of graphs, that are represented by a tensor.
The proposed architecture achieves markedly improved performance relative to standard GCNs, copes with state-of-the-art adversarial attacks, and leads to remarkable SSL performance over protein-to-protein interaction networks.
arXiv Detail & Related papers (2020-03-15T02:33:21Z) - Graph Representation Learning via Graphical Mutual Information
Maximization [86.32278001019854]
We propose a novel concept, Graphical Mutual Information (GMI), to measure the correlation between input graphs and high-level hidden representations.
We develop an unsupervised learning model trained by maximizing GMI between the input and output of a graph neural encoder.
arXiv Detail & Related papers (2020-02-04T08:33:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.