Frameless Graph Knowledge Distillation
- URL: http://arxiv.org/abs/2307.06631v1
- Date: Thu, 13 Jul 2023 08:56:50 GMT
- Title: Frameless Graph Knowledge Distillation
- Authors: Dai Shi, Zhiqi Shao, Yi Guo, Junbin Gao
- Abstract summary: We show how the graph knowledge supplied by the teacher is learned and digested by the student model via both algebra and geometry.
Our proposed model can achieve learning accuracy identical to, or even surpassing, that of the teacher model while maintaining high inference speed.
- Score: 27.831929635701886
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Knowledge distillation (KD) has shown great potential for transferring
knowledge from a complex teacher model to a simple student model in which the
heavy learning task can be accomplished efficiently and without losing too much
prediction accuracy. Recently, many attempts have been made by applying the KD
mechanism to the graph representation learning models such as graph neural
networks (GNNs) to accelerate the model's inference speed via student models.
However, many existing KD-based GNNs utilize an MLP as a universal approximator
in the student model to imitate the teacher model's process without considering
the graph knowledge from the teacher model. In this work, we provide a KD-based
framework on multi-scaled GNNs, known as graph framelet, and prove that by
adequately utilizing the graph knowledge in a multi-scaled manner provided by
graph framelet decomposition, the student model is capable of adapting to both
homophilic and heterophilic graphs and has the potential to alleviate the
over-squashing issue with a simple yet effective graph surgery. Furthermore,
we show how the graph knowledge supplied by the teacher is learned and digested
by the student model via both algebra and geometry. Comprehensive experiments
show that our proposed model can achieve learning accuracy identical to, or
even surpassing, that of the teacher model while maintaining high inference speed.
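For a concrete picture of this kind of teacher-student setup, below is a minimal sketch assuming a PyTorch-style student and precomputed multi-scale teacher features; the framelet decomposition and the paper's exact losses are not reproduced, so all names and weights here are illustrative.

```python
# Minimal sketch: combine a supervised loss, soft-label distillation, and
# per-scale feature matching between teacher and student. The multi-scale
# teacher features stand in for framelet components; the actual framelet
# transform is not implemented here.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels,
            student_feats, teacher_feats,
            temperature=2.0, alpha=0.5, beta=0.1):
    # Cross-entropy on the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)

    # Soft-label distillation on temperature-scaled logits.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    # Match student and teacher representations scale by scale
    # (e.g., one low-pass and several high-pass components).
    feat = sum(F.mse_loss(s, t.detach())
               for s, t in zip(student_feats, teacher_feats))

    return (1 - alpha) * ce + alpha * kd + beta * feat
```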
Related papers
- Disentangled Generative Graph Representation Learning [51.59824683232925]
This paper introduces DiGGR (Disentangled Generative Graph Representation Learning), a self-supervised learning framework.
It aims to learn latent disentangled factors and utilize them to guide graph mask modeling.
Experiments on 11 public datasets for two different graph learning tasks demonstrate that DiGGR consistently outperforms many previous self-supervised methods.
arXiv Detail & Related papers (2024-08-24T05:13:02Z) - Distilling Knowledge from Self-Supervised Teacher by Embedding Graph
Alignment [52.704331909850026]
We formulate a new knowledge distillation framework to transfer the knowledge from self-supervised pre-trained models to any other student network.
Inspired by the spirit of instance discrimination in self-supervised learning, we model the instance-instance relations by a graph formulation in the feature embedding space.
Our distillation scheme can be flexibly applied to transfer the self-supervised knowledge to enhance representation learning on various student networks.
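As a rough illustration of the instance-instance graph idea above, the sketch below builds cosine-similarity graphs over a batch of teacher and student embeddings and aligns them; the function names and the MSE alignment loss are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def embedding_graph(features):
    # Cosine-similarity graph over a batch of embeddings (batch_size x dim).
    z = F.normalize(features, dim=-1)
    return z @ z.t()

def graph_alignment_loss(student_feats, teacher_feats):
    # Push the student's instance-instance graph toward the teacher's.
    g_student = embedding_graph(student_feats)
    with torch.no_grad():
        g_teacher = embedding_graph(teacher_feats)
    return F.mse_loss(g_student, g_teacher)
```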
arXiv Detail & Related papers (2022-11-23T19:27:48Z) - Compressing Deep Graph Neural Networks via Adversarial Knowledge
Distillation [41.00398052556643]
We propose a novel Adversarial Knowledge Distillation framework for graph models named GraphAKD.
The discriminator distinguishes between teacher knowledge and what the student inherits, while the student GNN works as a generator and aims to fool the discriminator.
The results imply that GraphAKD can precisely transfer knowledge from a complicated teacher GNN to a compact student GNN.
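A minimal sketch of this adversarial setup, assuming a generic discriminator over node representations; the module names and binary cross-entropy objectives below are illustrative, not GraphAKD's actual components.

```python
import torch
import torch.nn.functional as F

def discriminator_step(disc, teacher_repr, student_repr):
    # Train the discriminator to label teacher knowledge as real,
    # student knowledge as fake.
    real = disc(teacher_repr.detach())
    fake = disc(student_repr.detach())
    return (F.binary_cross_entropy_with_logits(real, torch.ones_like(real))
            + F.binary_cross_entropy_with_logits(fake, torch.zeros_like(fake)))

def student_step(disc, student_repr, student_logits, labels):
    # The student (generator) tries to fool the discriminator while still
    # fitting the supervised task.
    fake = disc(student_repr)
    adv = F.binary_cross_entropy_with_logits(fake, torch.ones_like(fake))
    task = F.cross_entropy(student_logits, labels)
    return task + adv
```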
arXiv Detail & Related papers (2022-05-24T00:04:43Z) - Data-Free Adversarial Knowledge Distillation for Graph Neural Networks [62.71646916191515]
We propose the first end-to-end framework for data-free adversarial knowledge distillation on graph structured data (DFAD-GNN)
Specifically, DFAD-GNN employs a generative adversarial network with three main components: a pre-trained teacher model and a student model act as two discriminators, while a generator produces training graphs used to distill knowledge from the teacher into the student.
Our DFAD-GNN significantly surpasses state-of-the-art data-free baselines in the graph classification task.
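A coarse sketch of this data-free loop, under the assumption that teacher-student disagreement (here an L1 gap on outputs) serves as the adversarial signal; the generator interface and losses are placeholders rather than DFAD-GNN's actual design.

```python
import torch
import torch.nn.functional as F

def generator_step(generator, teacher, student, noise):
    # The generator synthesizes graphs on which teacher and student disagree.
    graphs = generator(noise)
    disagreement = F.l1_loss(student(graphs), teacher(graphs).detach())
    return -disagreement  # minimized, so disagreement is maximized

def student_step(generator, teacher, student, noise):
    # The student imitates the teacher on the generated graphs.
    with torch.no_grad():
        graphs = generator(noise)
        teacher_out = teacher(graphs)
    return F.l1_loss(student(graphs), teacher_out)
```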
arXiv Detail & Related papers (2022-05-08T08:19:40Z) - Graph Self-supervised Learning with Accurate Discrepancy Learning [64.69095775258164]
We propose a framework that aims to learn the exact discrepancy between the original and the perturbed graphs, coined as Discrepancy-based Self-supervised LeArning (D-SLA)
We validate our method on various graph-related downstream tasks, including molecular property prediction, protein function prediction, and link prediction tasks, on which our model largely outperforms relevant baselines.
arXiv Detail & Related papers (2022-02-07T08:04:59Z) - Extract the Knowledge of Graph Neural Networks and Go Beyond it: An
Effective Knowledge Distillation Framework [42.57467126227328]
We propose a framework based on knowledge distillation to address the issues of semi-supervised learning on graphs.
Our framework extracts the knowledge of an arbitrary learned GNN model (teacher model) and injects it into a well-designed student model.
Experimental results show that the learned student model can consistently outperform its corresponding teacher model by 1.4% - 4.7% on average.
arXiv Detail & Related papers (2021-03-04T08:13:55Z) - Tucker decomposition-based Temporal Knowledge Graph Completion [35.56360622521721]
We build a new tensor decomposition model for temporal knowledge graph completion, inspired by the Tucker decomposition of a 4th-order tensor.
We demonstrate that the proposed model is fully expressive and report state-of-the-art results for several public benchmarks.
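For reference, a 4th-order Tucker scoring function for a temporal quadruple typically takes the form below; the summary does not give the paper's exact notation, so the symbols are a generic sketch.

```latex
% \mathcal{W}: learnable 4th-order core tensor; \mathbf{e}_s, \mathbf{e}_o:
% subject/object entity embeddings; \mathbf{w}_r: relation embedding;
% \mathbf{e}_\tau: timestamp embedding; \times_n: mode-n tensor-vector product.
\phi(s, r, o, \tau) \;=\; \mathcal{W}
  \times_1 \mathbf{e}_s \times_2 \mathbf{w}_r
  \times_3 \mathbf{e}_o \times_4 \mathbf{e}_\tau
```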
arXiv Detail & Related papers (2020-11-16T07:05:52Z) - Iterative Graph Self-Distillation [161.04351580382078]
We propose a novel unsupervised graph learning paradigm called Iterative Graph Self-Distillation (IGSD)
IGSD iteratively performs the teacher-student distillation with graph augmentations.
We show that we achieve significant and consistent performance gains on various graph datasets in both unsupervised and semi-supervised settings.
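One common way to instantiate iterative teacher-student self-distillation is an exponential-moving-average teacher trained on augmented views, sketched below; the EMA schedule, consistency loss, and names are assumptions, since the summary does not specify IGSD's exact update rule.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_update(teacher, student, momentum=0.99):
    # Teacher parameters track a slow moving average of the student's.
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(momentum).add_(s_p, alpha=1.0 - momentum)

def self_distillation_loss(student, teacher, view_1, view_2):
    # Student predicts on one augmented graph view; targets come from the
    # teacher on the other view.
    z_student = F.normalize(student(view_1), dim=-1)
    with torch.no_grad():
        z_teacher = F.normalize(teacher(view_2), dim=-1)
    # Consistency between the two normalized embeddings.
    return (2 - 2 * (z_student * z_teacher).sum(dim=-1)).mean()
```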
arXiv Detail & Related papers (2020-10-23T18:37:06Z) - Distilling Knowledge from Graph Convolutional Networks [146.71503336770886]
Existing knowledge distillation methods focus on convolutional neural networks (CNNs)
We propose the first dedicated approach to distilling knowledge from a pre-trained graph convolutional network (GCN) model.
We show that our method achieves the state-of-the-art knowledge distillation performance for GCN models.
arXiv Detail & Related papers (2020-03-23T18:23:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.