Online Adversarial Distillation for Graph Neural Networks
- URL: http://arxiv.org/abs/2112.13966v1
- Date: Tue, 28 Dec 2021 02:30:11 GMT
- Title: Online Adversarial Distillation for Graph Neural Networks
- Authors: Can Wang, Zhe Wang, Defang Chen, Sheng Zhou, Yan Feng, Chun Chen
- Abstract summary: Knowledge distillation is a technique for improving the generalization ability of convolutional neural networks.
In this paper, we propose an online adversarial distillation approach to train a group of graph neural networks.
- Score: 40.746598033413086
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge distillation has recently become a popular technique for improving the
generalization ability of convolutional neural networks. However, its
effect on graph neural networks is less than satisfactory, since the graph
topology and node attributes are likely to change dynamically, in which case a
static teacher model is insufficient for guiding student training. In
this paper, we tackle this challenge by simultaneously training a group of
graph neural networks in an online distillation fashion, where the group
knowledge plays a role as a dynamic virtual teacher and the structure changes
in graph neural networks are effectively captured. To improve the distillation
performance, two types of knowledge are transferred among the students to
enhance each other: local knowledge reflecting information in the graph
topology and node attributes, and global knowledge reflecting the prediction
over classes. We transfer the global knowledge with KL-divergence as the
vanilla knowledge distillation does, while exploiting the complicated structure
of the local knowledge with an efficient adversarial cyclic learning framework.
Extensive experiments verify the effectiveness of our proposed online
adversarial distillation approach.
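For the global knowledge described above, the transfer follows vanilla knowledge distillation with a KL-divergence term toward the group's soft predictions. Below is a minimal PyTorch sketch of that part only; the averaged "virtual teacher", the temperature, and the loss weighting are illustrative assumptions, and the adversarial cyclic transfer of local knowledge is not shown.

```python
import torch
import torch.nn.functional as F

def group_distillation_loss(logits_list, labels, temperature=2.0, alpha=0.5):
    """Online distillation for a group of student GNNs (illustrative sketch).

    Each student is trained with (i) the usual cross-entropy on labels and
    (ii) a KL term pulling it toward the averaged soft predictions of the
    group, which acts as a dynamic virtual teacher.
    """
    # Dynamic virtual teacher: average of all students' softened predictions.
    with torch.no_grad():
        teacher_probs = torch.stack(
            [F.softmax(l / temperature, dim=-1) for l in logits_list]
        ).mean(dim=0)

    losses = []
    for logits in logits_list:
        ce = F.cross_entropy(logits, labels)
        kd = F.kl_div(
            F.log_softmax(logits / temperature, dim=-1),
            teacher_probs,
            reduction="batchmean",
        ) * (temperature ** 2)
        losses.append((1 - alpha) * ce + alpha * kd)
    return torch.stack(losses).sum()

# Toy usage: three "students" predicting over 7 classes for 10 nodes.
logits_list = [torch.randn(10, 7, requires_grad=True) for _ in range(3)]
labels = torch.randint(0, 7, (10,))
loss = group_distillation_loss(logits_list, labels)
loss.backward()
```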
Related papers
- Semantic Enhanced Knowledge Graph for Large-Scale Zero-Shot Learning [74.6485604326913]
We provide a new semantic enhanced knowledge graph that contains both expert knowledge and the semantic correlation between categories.
To propagate information on the knowledge graph, we propose a novel Residual Graph Convolutional Network (ResGCN).
Experiments conducted on the widely used large-scale ImageNet-21K dataset and AWA2 dataset show the effectiveness of our method.
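As a rough illustration of the residual graph convolution idea named above (ResGCN), here is a minimal sketch; the layer form, normalization, and dimensions are assumptions rather than the paper's exact architecture.

```python
import torch
import torch.nn as nn

class ResidualGCNLayer(nn.Module):
    """One graph-convolution layer with an additive residual connection."""
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)
        self.act = nn.ReLU()

    def forward(self, x, adj_norm):
        # adj_norm: symmetrically normalized adjacency (dense, for simplicity).
        h = self.act(self.linear(adj_norm @ x))
        return x + h  # residual connection

# Toy usage: 5 nodes on a ring, 16-dimensional features.
n, dim = 5, 16
adj = torch.zeros(n, n)
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1.0
adj = adj + torch.eye(n)                         # add self-loops
deg_inv_sqrt = adj.sum(1).pow(-0.5)
adj_norm = deg_inv_sqrt[:, None] * adj * deg_inv_sqrt[None, :]

layer = ResidualGCNLayer(dim)
out = layer(torch.randn(n, dim), adj_norm)
```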
arXiv Detail & Related papers (2022-12-26T13:18:36Z)
- Dynamic Community Detection via Adversarial Temporal Graph Representation Learning [17.487265170798974]
In this work, an adversarial temporal graph representation learning framework is proposed to detect dynamic communities from a small sample of brain network data.
In addition, the framework employs adversarial training to guide the learning of the temporal graph representation and optimizes a measurable modularity loss to maximize community modularity.
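Modularity can be made differentiable when community assignments are soft; the sketch below shows one common relaxation that could serve as such a loss. The soft-assignment form and the sign convention are assumptions, not the paper's exact formulation.

```python
import torch

def soft_modularity_loss(adj, assign_probs):
    """Differentiable modularity Q = (1/2m) * Tr(C^T (A - d d^T / 2m) C).

    adj:          dense symmetric adjacency matrix (N x N)
    assign_probs: soft community assignments (N x K), rows sum to 1
    Returning -Q turns modularity maximization into a loss to minimize.
    """
    deg = adj.sum(dim=1, keepdim=True)           # node degrees d (N x 1)
    two_m = adj.sum()                            # 2m = total degree
    modularity_matrix = adj - (deg @ deg.t()) / two_m
    q = torch.trace(assign_probs.t() @ modularity_matrix @ assign_probs) / two_m
    return -q

# Toy usage: two small cliques joined by a single edge.
adj = torch.zeros(6, 6)
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0
assign_logits = torch.randn(6, 2, requires_grad=True)
loss = soft_modularity_loss(adj, torch.softmax(assign_logits, dim=1))
loss.backward()
```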
arXiv Detail & Related papers (2022-06-29T08:44:22Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Being Friends Instead of Adversaries: Deep Networks Learn from Data Simplified by Other Networks [23.886422706697882]
A different idea, named Friendly Training, has recently been proposed: it alters the input data by adding an automatically estimated perturbation.
We revisit and extend this idea, inspired by the effectiveness of neural generators in the context of Adversarial Machine Learning.
We propose an auxiliary multi-layer network that is responsible for altering the input data so that the classifier can handle it more easily.
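A minimal sketch of the auxiliary-network idea summarized above: a small network estimates a perturbation of the input and is trained jointly with the classifier to lower the classification loss. The network sizes, the perturbation scale, and the joint objective are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Classifier plus an auxiliary network that learns to simplify its inputs.
classifier = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3))
simplifier = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 20))

opt = torch.optim.Adam(
    list(classifier.parameters()) + list(simplifier.parameters()), lr=1e-3
)

x = torch.randn(32, 20)
y = torch.randint(0, 3, (32,))

for step in range(100):
    # The auxiliary network estimates a perturbation that makes the data
    # easier for the classifier; epsilon bounds how much it can alter x.
    epsilon = 0.1
    x_easy = x + epsilon * torch.tanh(simplifier(x))
    loss = F.cross_entropy(classifier(x_easy), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```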
arXiv Detail & Related papers (2021-12-18T16:59:35Z)
- Learning through structure: towards deep neuromorphic knowledge graph embeddings [0.5906031288935515]
We propose a strategy to map deep graph learning architectures for knowledge graph reasoning to neuromorphic architectures.
Based on the insight that random, untrained graph neural networks are able to preserve local graph structures, we compose a frozen neural network with shallow knowledge graph embedding models.
We experimentally show that already on conventional computing hardware, this leads to a significant speedup and memory reduction while maintaining a competitive performance level.
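A minimal sketch of that composition under stated assumptions: a frozen, randomly initialized graph network provides fixed structural entity features, and only a shallow DistMult-style scorer on top is trained. The dimensions, the one-step message passing, and the scorer choice are all illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n_entities, n_relations, dim = 50, 4, 32

# Frozen, untrained "GNN": one round of random-weight message passing over a
# random graph, used only as a fixed structural feature extractor.
adj = (torch.rand(n_entities, n_entities) < 0.05).float()
adj = ((adj + adj.t()) > 0).float() + torch.eye(n_entities)
random_proj = torch.randn(dim, dim) / dim ** 0.5     # frozen random weights
base_feats = torch.randn(n_entities, dim)
with torch.no_grad():
    entity_emb = torch.tanh((adj @ base_feats) @ random_proj)

# Shallow, trainable scorer (DistMult-style): score(h, r, t) = <e_h, w_r, e_t>.
relation_emb = nn.Parameter(torch.randn(n_relations, dim))
opt = torch.optim.Adam([relation_emb], lr=0.01)

# One toy positive triple (head, relation, tail) with a logistic loss.
h, r, t = 0, 1, 2
score = (entity_emb[h] * relation_emb[r] * entity_emb[t]).sum()
loss = torch.nn.functional.softplus(-score)
loss.backward()
opt.step()
```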
arXiv Detail & Related papers (2021-09-21T18:01:04Z)
- Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher [40.74624021934218]
Knowledge distillation is a strategy of training a student network under the guidance of the soft output from a teacher network.
Recent findings on the neural tangent kernel enable us to approximate a wide neural network with a linear model of the network's random features.
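The linearization referred to above takes the form f(x; θ) ≈ f(x; θ0) + ∇θ f(x; θ0) · (θ − θ0), so the parameter gradients at initialization act as a fixed random-feature map. The tiny scalar-output network below is an illustrative assumption used only to show the approximation numerically.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Scalar-output MLP standing in for a "wide" network.
net = nn.Sequential(nn.Linear(5, 512), nn.ReLU(), nn.Linear(512, 1))
params0 = [p.detach().clone() for p in net.parameters()]

x = torch.randn(1, 5)

# phi(x) = grad_theta f(x; theta0): the random-feature map at initialization.
out0 = net(x).squeeze()
phi = torch.cat([g.reshape(-1)
                 for g in torch.autograd.grad(out0, net.parameters())])

# Perturb the parameters slightly (standing in for a few training steps).
with torch.no_grad():
    for p in net.parameters():
        p.add_(1e-3 * torch.randn_like(p))
    delta = torch.cat([(p - p0).reshape(-1)
                       for p, p0 in zip(net.parameters(), params0)])
    f_true = net(x).squeeze()
    f_lin = out0 + phi @ delta   # f(x; theta0) + phi(x) . (theta - theta0)

print(float(f_true), float(f_lin))   # close for wide nets / small updates
```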
arXiv Detail & Related papers (2020-10-20T07:33:21Z)
- A Heterogeneous Graph with Factual, Temporal and Logical Knowledge for Question Answering Over Dynamic Contexts [81.4757750425247]
We study question answering over a dynamic textual environment.
We develop a graph neural network over the constructed graph, and train the model in an end-to-end manner.
arXiv Detail & Related papers (2020-04-25T04:53:54Z)
- Distilling Knowledge from Graph Convolutional Networks [146.71503336770886]
Existing knowledge distillation methods focus on convolutional neural networks (CNNs).
We propose the first dedicated approach to distilling knowledge from a pre-trained graph convolutional network (GCN) model.
We show that our method achieves the state-of-the-art knowledge distillation performance for GCN models.
arXiv Detail & Related papers (2020-03-23T18:23:11Z)
- Curriculum By Smoothing [52.08553521577014]
Convolutional Neural Networks (CNNs) have shown impressive performance in computer vision tasks such as image classification, detection, and segmentation.
We propose an elegant curriculum-based scheme that smooths the feature embeddings of a CNN using anti-aliasing or low-pass filters.
As the amount of information in the feature maps increases during training, the network is able to progressively learn better representations of the data.
arXiv Detail & Related papers (2020-03-03T07:27:44Z)
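The Curriculum By Smoothing entry above describes low-pass filtering of a CNN's feature maps with a strength that decays during training. A minimal sketch of that mechanism follows; the Gaussian kernel, its annealing schedule, and the toy convolution are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def gaussian_kernel(sigma, size=5):
    """Depthwise low-pass (Gaussian) kernel used to smooth feature maps."""
    coords = torch.arange(size, dtype=torch.float32) - size // 2
    g = torch.exp(-(coords ** 2) / (2 * sigma ** 2))
    g = g / g.sum()
    return torch.outer(g, g)                     # separable 2-D Gaussian

def smooth_features(feat, sigma):
    """Blur each channel of an (N, C, H, W) feature map with the same kernel."""
    c = feat.shape[1]
    k = gaussian_kernel(sigma).to(feat).repeat(c, 1, 1, 1)   # (C, 1, 5, 5)
    return F.conv2d(feat, k, padding=2, groups=c)

conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
x = torch.randn(4, 3, 32, 32)

# Curriculum: start with heavy smoothing (little high-frequency information),
# then anneal sigma so the network gradually sees sharper feature maps.
for sigma in [2.0, 1.0, 0.5, 0.25]:
    feat = smooth_features(conv(x), sigma)
    # ... rest of the forward pass / loss / optimizer step would go here
```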
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.