Related papers: DA-MoE: Addressing Depth-Sensitivity in Graph-Level Analysis through Mixture of Experts

DA-MoE: Addressing Depth-Sensitivity in Graph-Level Analysis through Mixture of Experts

URL: http://arxiv.org/abs/2411.03025v1
Date: Tue, 05 Nov 2024 11:46:27 GMT
Title: DA-MoE: Addressing Depth-Sensitivity in Graph-Level Analysis through Mixture of Experts
Authors: Zelin Yao, Chuang Liu, Xianke Meng, Yibing Zhan, Jia Wu, Shirui Pan, Wenbin Hu,
Abstract summary: Graph neural networks (GNNs) are gaining popularity for processing graph-structured data. Existing methods generally use a fixed number of GNN layers to generate representations for all graphs. We propose the depth adaptive mixture of expert (DA-MoE) method, which incorporates two main improvements to GNN.
Score: 70.21017141742763
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Graph neural networks (GNNs) are gaining popularity for processing graph-structured data. In real-world scenarios, graph data within the same dataset can vary significantly in scale. This variability leads to depth-sensitivity, where the optimal depth of GNN layers depends on the scale of the graph data. Empirically, fewer layers are sufficient for message passing in smaller graphs, while larger graphs typically require deeper networks to capture long-range dependencies and global features. However, existing methods generally use a fixed number of GNN layers to generate representations for all graphs, overlooking the depth-sensitivity issue in graph structure data. To address this challenge, we propose the depth adaptive mixture of expert (DA-MoE) method, which incorporates two main improvements to GNN backbone: \textbf{1)} DA-MoE employs different GNN layers, each considered an expert with its own parameters. Such a design allows the model to flexibly aggregate information at different scales, effectively addressing the depth-sensitivity issue in graph data. \textbf{2)} DA-MoE utilizes GNN to capture the structural information instead of the linear projections in the gating network. Thus, the gating network enables the model to capture complex patterns and dependencies within the data. By leveraging these improvements, each expert in DA-MoE specifically learns distinct graph patterns at different scales. Furthermore, comprehensive experiments on the TU dataset and open graph benchmark (OGB) have shown that DA-MoE consistently surpasses existing baselines on various tasks, including graph, node, and link-level analyses. The code are available at \url{https://github.com/Celin-Yao/DA-MoE}.

Related papers

Revisiting Graph Neural Networks on Graph-level Tasks: Comprehensive Experiments, Analysis, and Improvements [54.006506479865344]
We propose a unified evaluation framework for graph-level Graph Neural Networks (GNNs) This framework provides a standardized setting to evaluate GNNs across diverse datasets. We also propose a novel GNN model with enhanced expressivity and generalization capabilities.
arXiv Detail & Related papers (2025-01-01T08:48:53Z)
Mixture of Experts Meets Decoupled Message Passing: Towards General and Adaptive Node Classification [4.129489934631072]
Graph neural networks excel at graph representation learning but struggle with heterophilous data and long-range dependencies. We propose GNNMoE, a universal model architecture for node classification. We show that GNNMoE performs exceptionally well across various types of graph data, effectively alleviating the over-smoothing issue and global noise.
arXiv Detail & Related papers (2024-12-11T08:35:13Z)
Tackling Oversmoothing in GNN via Graph Sparsification: A Truss-based Approach [1.4854797901022863]
We propose a novel and flexible truss-based graph sparsification model that prunes edges from dense regions of the graph. We then utilize our sparsification model in the state-of-the-art baseline GNNs and pooling models, such as GIN, SAGPool, GMT, DiffPool, MinCutPool, HGP-SL, DMonPool, and AdamGNN.
arXiv Detail & Related papers (2024-07-16T17:21:36Z)
Spectral Greedy Coresets for Graph Neural Networks [61.24300262316091]
The ubiquity of large-scale graphs in node-classification tasks hinders the real-world applications of Graph Neural Networks (GNNs) This paper studies graph coresets for GNNs and avoids the interdependence issue by selecting ego-graphs based on their spectral embeddings. Our spectral greedy graph coreset (SGGC) scales to graphs with millions of nodes, obviates the need for model pre-training, and applies to low-homophily graphs.
arXiv Detail & Related papers (2024-05-27T17:52:12Z)
SPGNN: Recognizing Salient Subgraph Patterns via Enhanced Graph Convolution and Pooling [25.555741218526464]
Graph neural networks (GNNs) have revolutionized the field of machine learning on non-Euclidean data such as graphs and networks. We propose a concatenation-based graph convolution mechanism that injectively updates node representations. We also design a novel graph pooling module, called WL-SortPool, to learn important subgraph patterns in a deep-learning manner.
arXiv Detail & Related papers (2024-04-21T13:11:59Z)
Graph Mixture of Experts: Learning on Large-Scale Graphs with Explicit Diversity Modeling [60.0185734837814]
Graph neural networks (GNNs) have found extensive applications in learning from graph data. To bolster the generalization capacity of GNNs, it has become customary to augment training graph structures with techniques like graph augmentations. This study introduces the concept of Mixture-of-Experts (MoE) to GNNs, with the aim of augmenting their capacity to adapt to a diverse range of training graph structures.
arXiv Detail & Related papers (2023-04-06T01:09:36Z)
A Robust Stacking Framework for Training Deep Graph Models with Multifaceted Node Features [61.92791503017341]
Graph Neural Networks (GNNs) with numerical node features and graph structure as inputs have demonstrated superior performance on various supervised learning tasks with graph data. The best models for such data types in most standard supervised learning settings with IID (non-graph) data are not easily incorporated into a GNN. Here we propose a robust stacking framework that fuses graph-aware propagation with arbitrary models intended for IID data.
arXiv Detail & Related papers (2022-06-16T22:46:33Z)
Simplifying approach to Node Classification in Graph Neural Networks [7.057970273958933]
We decouple the node feature aggregation step and depth of graph neural network, and empirically analyze how different aggregated features play a role in prediction performance. We show that not all features generated via aggregation steps are useful, and often using these less informative features can be detrimental to the performance of the GNN model. We present a simple and shallow model, Feature Selection Graph Neural Network (FSGNN), and show empirically that the proposed model achieves comparable or even higher accuracy than state-of-the-art GNN models.
arXiv Detail & Related papers (2021-11-12T14:53:22Z)
Node Feature Extraction by Self-Supervised Multi-scale Neighborhood Prediction [123.20238648121445]
We propose a new self-supervised learning framework, Graph Information Aided Node feature exTraction (GIANT) GIANT makes use of the eXtreme Multi-label Classification (XMC) formalism, which is crucial for fine-tuning the language model based on graph information. We demonstrate the superior performance of GIANT over the standard GNN pipeline on Open Graph Benchmark datasets.
arXiv Detail & Related papers (2021-10-29T19:55:12Z)
Improving Graph Neural Networks with Simple Architecture Design [7.057970273958933]
We introduce several key design strategies for graph neural networks. We present a simple and shallow model, Feature Selection Graph Neural Network (FSGNN) We show that the proposed model outperforms other state of the art GNN models and achieves up to 64% improvements in accuracy on node classification tasks.
arXiv Detail & Related papers (2021-05-17T06:46:01Z)
Robust Optimization as Data Augmentation for Large-scale Graphs [117.2376815614148]
We propose FLAG (Free Large-scale Adversarial Augmentation on Graphs), which iteratively augments node features with gradient-based adversarial perturbations during training. FLAG is a general-purpose approach for graph data, which universally works in node classification, link prediction, and graph classification tasks.
arXiv Detail & Related papers (2020-10-19T21:51:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.