Online Adversarial Knowledge Distillation for Graph Neural Networks
- URL: http://arxiv.org/abs/2112.13966v2
- Date: Thu, 05 Dec 2024 03:35:33 GMT
- Title: Online Adversarial Knowledge Distillation for Graph Neural Networks
- Authors: Can Wang, Zhe Wang, Defang Chen, Sheng Zhou, Yan Feng, Chun Chen
- Abstract summary: Knowledge distillation is used to enhance model generalization in Convolutional Neural Networks (CNNs).
In this paper, we propose an online adversarial distillation approach to train a group of graph neural networks.
- Score: 25.902263307225816
- Abstract: Knowledge distillation, a technique recently gaining popularity for enhancing model generalization in Convolutional Neural Networks (CNNs), operates under the assumption that both teacher and student models are trained on identical data distributions. However, its effect on Graph Neural Networks (GNNs) is less than satisfactory since the graph topology and node attributes are prone to evolve, thereby leading to the issue of distribution shift. In this paper, we tackle this challenge by simultaneously training a group of graph neural networks in an online distillation fashion, where the group knowledge plays a role as a dynamic virtual teacher and the structure changes in graph neural networks are effectively captured. To improve the distillation performance, two types of knowledge are transferred among the students to enhance each other: local knowledge reflecting information in the graph topology and node attributes, and global knowledge reflecting the prediction over classes. We transfer the global knowledge with KL-divergence as the vanilla knowledge distillation does, while exploiting the complicated structure of the local knowledge with an efficient adversarial cyclic learning framework. Extensive experiments verified the effectiveness of our proposed online adversarial distillation approach. The code is published at https://github.com/wangz3066/OnlineDistillGCN.
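The global-knowledge transfer described in the abstract (KL-divergence between a student and a group-derived virtual teacher) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the temperature value, and in particular the choice to build each student's virtual teacher as the plain average of its peers' softened predictions are assumptions not stated in the abstract. The adversarial transfer of local knowledge is not covered here.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def group_distillation_losses(student_logits, temperature=2.0):
    """Return one KL loss per student for a single node.

    ASSUMPTION (hypothetical aggregation): the virtual teacher for
    student i is the unweighted mean of the other students' softened
    class probabilities. The paper may aggregate differently.
    """
    if len(student_logits) < 2:
        raise ValueError("online distillation needs at least two students")
    probs = [softmax(logits, temperature) for logits in student_logits]
    losses = []
    for i, own in enumerate(probs):
        peers = [p for j, p in enumerate(probs) if j != i]
        # Average the peers' distributions class by class.
        teacher = [sum(col) / len(peers) for col in zip(*peers)]
        losses.append(kl_divergence(teacher, own))
    return losses
```

Calling `group_distillation_losses([[2.0, 0.5, -1.0], [1.5, 1.0, -0.5], [0.2, 0.1, 0.0]])` returns three non-negative values, one per student; students that already agree with their peers receive a loss close to zero. In a real training loop this term would be added to each student's supervised loss and computed over all nodes with an autodiff framework.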
Related papers
- Graph Neural Networks Provably Benefit from Structural Information: A Feature Learning Perspective [53.999128831324576]
Graph neural networks (GNNs) have pioneered advancements in graph representation learning.
This study investigates the role of graph convolution within the context of feature learning theory.
arXiv Detail & Related papers (2023-06-24T10:21:11Z)
- Semantic Enhanced Knowledge Graph for Large-Scale Zero-Shot Learning [74.6485604326913]
We provide a new semantic enhanced knowledge graph that contains both expert knowledge and semantic correlations between categories.
To propagate information on the knowledge graph, we propose a novel Residual Graph Convolutional Network (ResGCN).
Experiments conducted on the widely used large-scale ImageNet-21K dataset and AWA2 dataset show the effectiveness of our method.
arXiv Detail & Related papers (2022-12-26T13:18:36Z)
- Geometric Knowledge Distillation: Topology Compression for Graph Neural Networks [80.8446673089281]
We study a new paradigm of knowledge transfer that aims at encoding graph topological information into graph neural networks (GNNs).
We propose Neural Heat Kernel (NHK) to encapsulate the geometric property of the underlying manifold concerning the architecture of GNNs.
A fundamental and principled solution, dubbed Geometric Knowledge Distillation, is derived by aligning NHKs on teacher and student models.
arXiv Detail & Related papers (2022-10-24T08:01:58Z)
- Compressing Deep Graph Neural Networks via Adversarial Knowledge Distillation [41.00398052556643]
We propose a novel Adversarial Knowledge Distillation framework for graph models named GraphAKD.
The discriminator distinguishes between teacher knowledge and what the student inherits, while the student GNN works as a generator and aims to fool the discriminator.
The results imply that GraphAKD can precisely transfer knowledge from a complicated teacher GNN to a compact student GNN.
arXiv Detail & Related papers (2022-05-24T00:04:43Z)
- Alignahead: Online Cross-Layer Knowledge Extraction on Graph Neural Networks [6.8080936803807734]
Existing knowledge distillation methods on graph neural networks (GNNs) are almost all offline.
We propose a novel online knowledge distillation framework to resolve this problem.
We develop a cross-layer distillation strategy that aligns one student layer ahead with a layer at a different depth of another student model.
arXiv Detail & Related papers (2022-05-05T06:48:13Z)
- Investigating Transfer Learning in Graph Neural Networks [2.320417845168326]
Graph neural networks (GNNs) build on the success of deep learning models by extending them for use in graph spaces.
Transfer learning has proven extremely successful for traditional deep learning problems, resulting in faster training and improved performance.
This research demonstrates that transfer learning is effective with GNNs, and describes how source tasks and the choice of GNN impact the ability to learn generalisable knowledge.
arXiv Detail & Related papers (2022-02-01T20:33:15Z)
- ROD: Reception-aware Online Distillation for Sparse Graphs [23.55530524584572]
We propose ROD, a novel reception-aware online knowledge distillation approach for sparse graph learning.
We design three supervision signals for ROD: multi-scale reception-aware graph knowledge, task-based supervision, and rich distilled knowledge.
Our approach has been extensively evaluated on 9 datasets and a variety of graph-based tasks.
arXiv Detail & Related papers (2021-07-25T11:55:47Z)
- A Heterogeneous Graph with Factual, Temporal and Logical Knowledge for Question Answering Over Dynamic Contexts [81.4757750425247]
We study question answering over a dynamic textual environment.
We develop a graph neural network over the constructed graph, and train the model in an end-to-end manner.
arXiv Detail & Related papers (2020-04-25T04:53:54Z)
- Distilling Knowledge from Graph Convolutional Networks [146.71503336770886]
Existing knowledge distillation methods focus on convolutional neural networks (CNNs).
We propose the first dedicated approach to distilling knowledge from a pre-trained graph convolutional network (GCN) model.
We show that our method achieves the state-of-the-art knowledge distillation performance for GCN models.
arXiv Detail & Related papers (2020-03-23T18:23:11Z)
- Curriculum By Smoothing [52.08553521577014]
Convolutional Neural Networks (CNNs) have shown impressive performance in computer vision tasks such as image classification, detection, and segmentation.
We propose an elegant curriculum-based scheme that smooths the feature embedding of a CNN using anti-aliasing or low-pass filters.
As the amount of information in the feature maps increases during training, the network is able to progressively learn better representations of the data.
arXiv Detail & Related papers (2020-03-03T07:27:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.