Unveiling the Unseen Potential of Graph Learning through MLPs: Effective
Graph Learners Using Propagation-Embracing MLPs
- URL: http://arxiv.org/abs/2311.11759v1
- Date: Mon, 20 Nov 2023 13:39:19 GMT
- Title: Unveiling the Unseen Potential of Graph Learning through MLPs: Effective
Graph Learners Using Propagation-Embracing MLPs
- Authors: Yong-Min Shin, Won-Yong Shin
- Abstract summary: We train a student MLP by knowledge distillation (KD) from a teacher graph neural network (GNN).
Inspired by GNNs that separate feature transformation $T$ and propagation $\Pi$, we re-frame the KD process as enabling the student to explicitly learn both $T$ and $\Pi$.
We propose Propagate & Distill (P&D), which propagates the output of the teacher GNN before KD and can be interpreted as an approximate process of the inverse propagation $\Pi^{-1}$.
- Score: 9.731314045194495
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent studies attempted to utilize multilayer perceptrons (MLPs) to solve
semi-supervised node classification on graphs, by training a student MLP by
knowledge distillation (KD) from a teacher graph neural network (GNN). While
previous studies have focused mostly on training the student MLP by matching
the output probability distributions between the teacher and student models
during KD, it has not been systematically studied how to inject the structural
information in an explicit and interpretable manner. Inspired by GNNs that
separate feature transformation $T$ and propagation $\Pi$, we re-frame the KD
process as enabling the student MLP to explicitly learn both $T$ and $\Pi$.
Although this can be achieved by applying the inverse propagation $\Pi^{-1}$
before distillation from the teacher GNN, it still comes with a high
computational cost from large matrix multiplications during training. To solve
this problem, we propose Propagate & Distill (P&D), which propagates the output
of the teacher GNN before KD and can be interpreted as an approximate process
of the inverse propagation $\Pi^{-1}$. Through comprehensive evaluations using
real-world benchmark datasets, we demonstrate the effectiveness of P&D by
showing further performance boost of the student MLP.
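Since the abstract describes P&D only at a high level, the snippet below is a minimal, hedged sketch of the propagate-then-distill idea: the teacher GNN's soft labels are propagated over the graph and the student MLP is then distilled toward the propagated targets. The symmetric-normalized adjacency, the APPNP-style (personalized-PageRank) propagation with step count `num_steps` and teleport coefficient `alpha`, and the KL-based distillation loss are all illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of propagating teacher outputs before knowledge distillation.
# Propagation scheme, hyperparameters, and loss are illustrative assumptions.
import torch
import torch.nn.functional as F

def normalize_adj(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} of a dense adjacency."""
    adj = adj + torch.eye(adj.size(0))
    deg_inv_sqrt = adj.sum(dim=1).pow(-0.5)
    return deg_inv_sqrt.unsqueeze(1) * adj * deg_inv_sqrt.unsqueeze(0)

def propagate_teacher(teacher_logits, adj_norm, num_steps=10, alpha=0.1):
    """Propagate teacher soft labels over the graph before distillation."""
    z0 = F.softmax(teacher_logits, dim=1)
    z = z0
    for _ in range(num_steps):
        z = (1 - alpha) * adj_norm @ z + alpha * z0   # personalized-PageRank-style step
    return z / z.sum(dim=1, keepdim=True)             # renormalize rows to valid distributions

# Toy setup: random graph, node features, and a stand-in for a trained teacher's output.
n, d, c = 100, 16, 7
adj = (torch.rand(n, n) < 0.05).float()
adj = ((adj + adj.t()) > 0).float()
x = torch.randn(n, d)
teacher_logits = torch.randn(n, c)

targets = propagate_teacher(teacher_logits, normalize_adj(adj))

student = torch.nn.Sequential(torch.nn.Linear(d, 64), torch.nn.ReLU(), torch.nn.Linear(64, c))
opt = torch.optim.Adam(student.parameters(), lr=1e-2)
for _ in range(200):
    log_probs = F.log_softmax(student(x), dim=1)
    loss = F.kl_div(log_probs, targets, reduction="batchmean")  # distill from propagated targets
    opt.zero_grad(); loss.backward(); opt.step()
```

At inference time the student uses only node features, so the graph and the propagation step are needed only during training.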
Related papers
- Heuristic Methods are Good Teachers to Distill MLPs for Graph Link Prediction [61.70012924088756]
Distilling Graph Neural Networks (GNNs) teachers into Multi-Layer Perceptrons (MLPs) students has emerged as an effective approach to achieve strong performance.
However, existing distillation methods only use standard GNNs and overlook alternative teachers such as specialized models for link prediction (GNN4LP) and heuristic methods (e.g., common neighbors).
This paper first explores the impact of different teachers in GNN-to-MLP distillation; we find that stronger teachers do not always produce stronger students, while weaker heuristic methods can teach MLPs to near-GNN performance with drastically reduced training costs.
arXiv Detail & Related papers (2025-04-08T16:35:11Z) - Teach Harder, Learn Poorer: Rethinking Hard Sample Distillation for GNN-to-MLP Knowledge Distillation [56.912354708167534]
To bridge Graph Neural Networks (GNNs) and lightweight Multi-Layer Perceptrons (MLPs), GNN-to-MLP Knowledge Distillation (KD) proposes to distill knowledge from a well-trained teacher GNN into a student MLP.
This paper proposes a simple yet effective Hardness-aware GNN-to-MLP Distillation (HGMD) framework.
arXiv Detail & Related papers (2024-07-20T06:13:00Z) - A Teacher-Free Graph Knowledge Distillation Framework with Dual
Self-Distillation [58.813991312803246]
We propose a Teacher-Free Graph Self-Distillation (TGS) framework that does not require any teacher model or GNNs during both training and inference.
TGS enjoys the benefits of graph topology awareness in training but is free from data dependency in inference.
arXiv Detail & Related papers (2024-03-06T05:52:13Z) - Propagate & Distill: Towards Effective Graph Learners Using
Propagation-Embracing MLPs [9.731314045194495]
We train a student by knowledge distillation from a teacher graph neural network (GNN)
Inspired by GNNs that separate feature transformation $T$ and propagation $\Pi$, we re-frame the distillation process as making the student learn both $T$ and $\Pi$.
We propose Propagate & Distill (P&D), which propagates the output of the teacher before distillation; this can be interpreted as an approximate process of the inverse propagation $\Pi^{-1}$.
arXiv Detail & Related papers (2023-11-29T16:26:24Z) - VQGraph: Rethinking Graph Representation Space for Bridging GNNs and
MLPs [97.63412451659826]
VQGraph learns a structure-aware tokenizer on graph data that can encode each node's local substructure as a discrete code.
VQGraph achieves new state-of-the-art performance on GNN-to-MLP distillation in both transductive and inductive settings.
arXiv Detail & Related papers (2023-08-04T02:58:08Z) - Graph Neural Networks Provably Benefit from Structural Information: A
Feature Learning Perspective [53.999128831324576]
Graph neural networks (GNNs) have pioneered advancements in graph representation learning.
This study investigates the role of graph convolution within the context of feature learning theory.
arXiv Detail & Related papers (2023-06-24T10:21:11Z) - Edge-free but Structure-aware: Prototype-Guided Knowledge Distillation
from GNNs to MLPs [22.541655587228203]
Distilling high-accuracy Graph Neural Networks (GNNs) to low-latency multilayer perceptrons (MLPs) on graph tasks has become a hot research topic.
We propose a Prototype-Guided Knowledge Distillation (PGKD) method, which does not require graph edges (edge-free) yet learns structure-aware MLPs.
arXiv Detail & Related papers (2023-03-24T02:28:55Z) - SA-MLP: Distilling Graph Knowledge from GNNs into Structure-Aware MLP [46.52398427166938]
One promising inference acceleration direction is to distill the GNNs into message-passing-free student multi-layer perceptrons.
We introduce a novel structure-mixing knowledge distillation strategy to enhance the students' ability to learn structural information.
Our SA-MLP can consistently outperform the teacher GNNs while maintaining the faster inference of MLPs.
arXiv Detail & Related papers (2022-10-18T05:55:36Z) - MLPInit: Embarrassingly Simple GNN Training Acceleration with MLP
Initialization [51.76758674012744]
Training graph neural networks (GNNs) on large graphs is complex and extremely time consuming.
We propose an embarrassingly simple, yet hugely effective method for GNN training acceleration, called MLPInit.
arXiv Detail & Related papers (2022-09-30T21:33:51Z) - Bootstrapped Representation Learning on Graphs [37.62546075583656]
Current state-of-the-art self-supervised learning methods for graph neural networks (GNNs) are based on contrastive learning.
Inspired by BYOL, we present Bootstrapped Graph Latents (BGRL), a self-supervised graph representation learning method.
BGRL outperforms or matches the previous unsupervised state-of-the-art results on several established benchmark datasets.
arXiv Detail & Related papers (2021-02-12T13:36:39Z)
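To make the BYOL-style scheme in the last entry concrete, here is a minimal, hedged sketch of bootstrapped representation learning on a graph: an online encoder plus predictor is trained to match a target encoder, updated only by exponential moving average (EMA), applied to a differently augmented view, without negative samples. The dense one-layer GCN-style encoder, the augmentations, and all hyperparameters are illustrative assumptions rather than BGRL's exact architecture.

```python
# Minimal sketch of BYOL-style bootstrapping on a graph (no negative samples).
# Encoder, augmentations, and hyperparameters are illustrative assumptions.
import copy
import torch
import torch.nn.functional as F

class GCNEncoder(torch.nn.Module):
    def __init__(self, d_in, d_hid):
        super().__init__()
        self.lin = torch.nn.Linear(d_in, d_hid)
    def forward(self, x, adj):
        return torch.relu(adj @ self.lin(x))          # one transformation + propagation step

def augment(x, adj, drop_feat=0.2, drop_edge=0.2):
    """Random feature masking and edge dropping as graph augmentations."""
    x_aug = x * (torch.rand_like(x) > drop_feat).float()
    adj_aug = adj * (torch.rand_like(adj) > drop_edge).float()
    return x_aug, adj_aug

n, d, h = 100, 16, 32
x = torch.randn(n, d)
adj = ((torch.rand(n, n) < 0.05).float() + torch.eye(n)).clamp(max=1.0)
adj = adj / adj.sum(dim=1, keepdim=True)              # simple row normalization

online = GCNEncoder(d, h)
predictor = torch.nn.Linear(h, h)
target = copy.deepcopy(online)                        # EMA copy, never trained by gradients
for p in target.parameters():
    p.requires_grad_(False)

opt = torch.optim.Adam(list(online.parameters()) + list(predictor.parameters()), lr=1e-3)
for step in range(100):
    (x1, a1), (x2, a2) = augment(x, adj), augment(x, adj)
    pred = predictor(online(x1, a1))
    with torch.no_grad():
        tgt = target(x2, a2)
    # Negative cosine similarity between prediction and target embeddings.
    loss = -F.cosine_similarity(pred, tgt, dim=1).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():                             # EMA update of the target encoder
        for p_t, p_o in zip(target.parameters(), online.parameters()):
            p_t.mul_(0.99).add_(0.01 * p_o)
```

After training, the online encoder's node embeddings can be evaluated, for example, with a linear classifier on a downstream node-classification task.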