Online Adversarial Distillation for Graph Neural Networks
- URL: http://arxiv.org/abs/2112.13966v1
- Date: Tue, 28 Dec 2021 02:30:11 GMT
- Title: Online Adversarial Distillation for Graph Neural Networks
- Authors: Can Wang, Zhe Wang, Defang Chen, Sheng Zhou, Yan Feng, Chun Chen
- Abstract summary: Knowledge distillation is a technique for improving the generalization ability of convolutional neural networks.
In this paper, we propose an online adversarial distillation approach to train a group of graph neural networks.
- Score: 40.746598033413086
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge distillation has recently become a popular technique to improve the
model generalization ability on convolutional neural networks. However, its
effect on graph neural networks is less than satisfactory since the graph
topology and node attributes are likely to change in a dynamic way and in this
case a static teacher model is insufficient in guiding student training. In
this paper, we tackle this challenge by simultaneously training a group of
graph neural networks in an online distillation fashion, where the group
knowledge plays a role as a dynamic virtual teacher and the structure changes
in graph neural networks are effectively captured. To improve the distillation
performance, two types of knowledge are transferred among the students to
enhance each other: local knowledge reflecting information in the graph
topology and node attributes, and global knowledge reflecting the prediction
over classes. We transfer the global knowledge with KL-divergence as the
vanilla knowledge distillation does, while exploiting the complicated structure
of the local knowledge with an efficient adversarial cyclic learning framework.
Extensive experiments verified the effectiveness of our proposed online
adversarial distillation approach.
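The global-knowledge transfer described in the abstract (KL-divergence between class predictions, with the group acting as a dynamic virtual teacher) can be illustrated with a minimal sketch. The abstract does not say how the group knowledge is aggregated, so the peer-averaging used here is an assumption, as are the `temperature` parameter and all function names. The adversarial transfer of local knowledge is not covered.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax with an (assumed) temperature parameter."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q), averaged over nodes (rows)."""
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))

def group_distillation_losses(student_logits, temperature=1.0):
    """One KL loss per student against the mean prediction of its peers.

    Assumption: the "virtual teacher" for student i is the average of the
    other students' class distributions. The paper may aggregate differently.
    """
    probs = [softmax(l, temperature) for l in student_logits]
    losses = []
    for i, p_i in enumerate(probs):
        peers = [p for j, p in enumerate(probs) if j != i]
        teacher = np.mean(peers, axis=0)  # dynamic virtual teacher (hypothetical form)
        losses.append(kl_divergence(teacher, p_i))
    return losses

# Hypothetical example: 3 students, 5 nodes, 3 classes.
rng = np.random.default_rng(0)
logits = [rng.normal(size=(5, 3)) for _ in range(3)]
losses = group_distillation_losses(logits)
```

Because the teacher is recomputed from the current student predictions at every step, it changes as the students train, which is what distinguishes this online setting from distillation with a fixed pre-trained teacher.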
Related papers
- Graph Neural Networks Provably Benefit from Structural Information: A Feature Learning Perspective [53.999128831324576]
Graph neural networks (GNNs) have pioneered advancements in graph representation learning.
This study investigates the role of graph convolution within the context of feature learning theory.
arXiv Detail & Related papers (2023-06-24T10:21:11Z)
- Semantic Enhanced Knowledge Graph for Large-Scale Zero-Shot Learning [74.6485604326913]
We provide a new semantic enhanced knowledge graph that contains both expert knowledge and semantic correlations between categories.
To propagate information on the knowledge graph, we propose a novel Residual Graph Convolutional Network (ResGCN).
Experiments conducted on the widely used large-scale ImageNet-21K dataset and AWA2 dataset show the effectiveness of our method.
arXiv Detail & Related papers (2022-12-26T13:18:36Z)
- Geometric Knowledge Distillation: Topology Compression for Graph Neural Networks [80.8446673089281]
We study a new paradigm of knowledge transfer that aims at encoding graph topological information into graph neural networks (GNNs).
We propose Neural Heat Kernel (NHK) to encapsulate the geometric property of the underlying manifold concerning the architecture of GNNs.
A fundamental and principled solution is derived by aligning NHKs on teacher and student models, dubbed as Geometric Knowledge Distillation.
arXiv Detail & Related papers (2022-10-24T08:01:58Z)
- Compressing Deep Graph Neural Networks via Adversarial Knowledge Distillation [41.00398052556643]
We propose a novel Adversarial Knowledge Distillation framework for graph models named GraphAKD.
The discriminator distinguishes between teacher knowledge and what the student inherits, while the student GNN works as a generator and aims to fool the discriminator.
The results imply that GraphAKD can precisely transfer knowledge from a complicated teacher GNN to a compact student GNN.
arXiv Detail & Related papers (2022-05-24T00:04:43Z)
- Alignahead: Online Cross-Layer Knowledge Extraction on Graph Neural Networks [6.8080936803807734]
Existing knowledge distillation methods on graph neural networks (GNNs) are almost all offline.
We propose a novel online knowledge distillation framework to resolve this problem.
We develop a cross-layer distillation strategy that aligns each layer of one student with a layer at a different depth in another student model.
arXiv Detail & Related papers (2022-05-05T06:48:13Z)
- Investigating Transfer Learning in Graph Neural Networks [2.320417845168326]
Graph neural networks (GNNs) build on the success of deep learning models by extending them for use in graph spaces.
Transfer learning has proven extremely successful for traditional deep learning problems, resulting in faster training and improved performance.
This research demonstrates that transfer learning is effective with GNNs, and describes how source tasks and the choice of GNN impact the ability to learn generalisable knowledge.
arXiv Detail & Related papers (2022-02-01T20:33:15Z)
- ROD: Reception-aware Online Distillation for Sparse Graphs [23.55530524584572]
We propose ROD, a novel reception-aware online knowledge distillation approach for sparse graph learning.
We design three supervision signals for ROD: multi-scale reception-aware graph knowledge, task-based supervision, and rich distilled knowledge.
Our approach has been extensively evaluated on 9 datasets and a variety of graph-based tasks.
arXiv Detail & Related papers (2021-07-25T11:55:47Z)
- A Heterogeneous Graph with Factual, Temporal and Logical Knowledge for Question Answering Over Dynamic Contexts [81.4757750425247]
We study question answering over a dynamic textual environment.
We develop a graph neural network over the constructed graph, and train the model in an end-to-end manner.
arXiv Detail & Related papers (2020-04-25T04:53:54Z)
- Distilling Knowledge from Graph Convolutional Networks [146.71503336770886]
Existing knowledge distillation methods focus on convolutional neural networks (CNNs).
We propose the first dedicated approach to distilling knowledge from a pre-trained graph convolutional network (GCN) model.
We show that our method achieves the state-of-the-art knowledge distillation performance for GCN models.
arXiv Detail & Related papers (2020-03-23T18:23:11Z)
- Curriculum By Smoothing [52.08553521577014]
Convolutional Neural Networks (CNNs) have shown impressive performance in computer vision tasks such as image classification, detection, and segmentation.
We propose an elegant curriculum-based scheme that smooths the feature embedding of a CNN using anti-aliasing or low-pass filters.
As the amount of information in the feature maps increases during training, the network is able to progressively learn better representations of the data.
arXiv Detail & Related papers (2020-03-03T07:27:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.