Snowflake: Scaling GNNs to High-Dimensional Continuous Control via
Parameter Freezing
- URL: http://arxiv.org/abs/2103.01009v1
- Date: Mon, 1 Mar 2021 13:56:10 GMT
- Title: Snowflake: Scaling GNNs to High-Dimensional Continuous Control via
Parameter Freezing
- Authors: Charlie Blake, Vitaly Kurin, Maximilian Igl, Shimon Whiteson
- Abstract summary: Recent research has shown that Graph Neural Networks (GNNs) can learn policies for locomotion control that are as effective as a typical multi-layer perceptron (MLP).
Results have so far been limited to training on small agents, with the performance of GNNs deteriorating rapidly as the number of sensors and actuators grows.
We introduce Snowflake, a GNN training method for high-dimensional continuous control that freezes parameters in parts of the network that suffer from overfitting.
- Score: 55.42968877840648
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent research has shown that Graph Neural Networks (GNNs) can learn
policies for locomotion control that are as effective as a typical multi-layer
perceptron (MLP), with superior transfer and multi-task performance (Wang et
al., 2018; Huang et al., 2020). Results have so far been limited to training on
small agents, with the performance of GNNs deteriorating rapidly as the number
of sensors and actuators grows. A key motivation for the use of GNNs in the
supervised learning setting is their applicability to large graphs, but this
benefit has not yet been realised for locomotion control. We identify a
weakness in a common GNN architecture that causes this poor scaling:
overfitting in the MLPs within the network that encode, decode, and propagate
messages. To combat this, we introduce Snowflake, a GNN training method for
high-dimensional continuous control that freezes parameters in parts of the
network that suffer from overfitting. Snowflake significantly boosts the
performance of GNNs for locomotion control on large agents, now matching the
performance of MLPs, and with superior transfer properties.
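The abstract does not specify exactly which parameters Snowflake freezes, or when, so the sketch below only illustrates the freezing mechanism itself, in a hypothetical PyTorch-style message-passing policy whose encoder/message/decoder submodule names stand in for the encode, propagate, and decode MLPs mentioned above:
```python
import torch
import torch.nn as nn

# Illustrative sketch of Snowflake-style parameter freezing, not the
# authors' implementation. The module names and the choice of which
# modules to freeze are assumptions made for this example.

class GNNPolicy(nn.Module):
    def __init__(self, obs_dim=16, hidden=64, act_dim=1):
        super().__init__()
        # The shared MLPs the abstract identifies as overfitting-prone:
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.message = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU())
        self.decoder = nn.Linear(hidden, act_dim)

    def forward(self, node_obs, adj_norm):
        h = self.encoder(node_obs)            # (N, hidden) node embeddings
        agg = adj_norm @ h                    # aggregate neighbour messages
        h = h + self.message(torch.cat([h, agg], dim=-1))
        return self.decoder(h)                # one action per actuator node

def freeze_and_rebuild_optimizer(policy, names=("encoder", "message"), lr=3e-4):
    """Freeze the chosen submodules; optimize only what remains trainable."""
    for name in names:
        for p in getattr(policy, name).parameters():
            p.requires_grad_(False)
    trainable = [p for p in policy.parameters() if p.requires_grad]
    return torch.optim.Adam(trainable, lr=lr)

policy = GNNPolicy()
optimizer = freeze_and_rebuild_optimizer(policy)
```
Freezing excludes those parameters from the optimizer, so the overfitting-prone MLPs stop updating while the rest of the policy continues to train.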
Related papers
- Grimm: A Plug-and-Play Perturbation Rectifier for Graph Neural Networks Defending against Poisoning Attacks [53.972077392749185]
Recent studies have revealed the vulnerability of graph neural networks (GNNs) to adversarial poisoning attacks on node classification tasks.
Here we introduce Grimm, the first plug-and-play defense model.
arXiv Detail & Related papers (2024-12-11T17:17:02Z)
- LazyGNN: Large-Scale Graph Neural Networks via Lazy Propagation [51.552170474958736]
We propose to capture long-distance dependencies in graphs with shallower models rather than deeper ones, which leads to a much more efficient model, LazyGNN, for graph representation learning.
LazyGNN is compatible with existing scalable approaches (such as sampling methods) for further acceleration through the development of mini-batch LazyGNN.
Comprehensive experiments demonstrate its superior prediction performance and scalability on large-scale benchmarks.
arXiv Detail & Related papers (2023-02-03T02:33:07Z)
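The summary gives only the high-level idea of lazy propagation, so the following is a speculative sketch, not the LazyGNN reference implementation: cache an expensive multi-hop propagation and reuse it across training steps, refreshing it only occasionally, so a shallow model still sees long-distance information.
```python
import torch

# Speculative illustration of "lazy" propagation: recompute the expensive
# multi-hop propagation only occasionally, reusing the cached (stale but
# cheap) result in between.

class LazyPropagator:
    def __init__(self, adj_norm, hops=4, refresh_every=10):
        self.adj_norm = adj_norm          # normalized (N, N) adjacency
        self.hops = hops
        self.refresh_every = refresh_every
        self.cache = None
        self.step = 0

    def propagate(self, x):
        if self.cache is None or self.step % self.refresh_every == 0:
            h = x
            for _ in range(self.hops):    # multi-hop, long-distance mixing
                h = self.adj_norm @ h
            self.cache = h.detach()
        self.step += 1
        return self.cache

adj_norm = torch.eye(8)                   # stand-in normalized adjacency
prop = LazyPropagator(adj_norm)
features = torch.randn(8, 16)
out = prop.propagate(features)            # recomputed every 10th call
```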
- Graph Neural Networks are Inherently Good Generalizers: Insights by Bridging GNNs and MLPs [71.93227401463199]
This paper pinpoints GNNs' intrinsic capability as the major source of their performance gain, by introducing an intermediate model class dubbed P(ropagational)MLP.
We observe that PMLPs consistently perform on par with (or even exceed) their GNN counterparts, while being much more efficient in training.
arXiv Detail & Related papers (2022-12-18T08:17:32Z)
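A minimal sketch of the PMLP idea as described above: the model trains as a plain MLP and only applies message passing at inference time. The single propagation step over a normalized adjacency matrix is an assumption made for brevity, not the paper's exact scheme.
```python
import torch
import torch.nn as nn

# Minimal PMLP sketch: a plain MLP during training, with message passing
# switched on at inference.

class PMLP(nn.Module):
    def __init__(self, in_dim=32, hidden=64, out_dim=7):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hidden)
        self.lin2 = nn.Linear(hidden, out_dim)

    def forward(self, x, adj_norm=None):
        h = torch.relu(self.lin1(x))
        if not self.training and adj_norm is not None:
            h = adj_norm @ h              # propagation only at inference
        return self.lin2(h)

# model.train(): forward(x) is a pure MLP, so training is cheap.
# model.eval():  forward(x, adj_norm) behaves like a GNN.
```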
- Addressing Over-Smoothing in Graph Neural Networks via Deep Supervision [13.180922099929765]
Deep graph neural networks (GNNs) suffer from over-smoothing when the number of layers increases.
We propose DSGNNs, GNNs enhanced with deep supervision, where representations learned at all layers are used for training.
We show that DSGNNs are resilient to over-smoothing and can outperform competitive benchmarks on node and graph property prediction problems.
arXiv Detail & Related papers (2022-02-25T06:05:55Z)
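A sketch of deep supervision in this spirit: every layer's representation feeds an auxiliary head, and the training loss sums over all of them rather than using only the final layer. Layer types and head shapes here are illustrative assumptions.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Deep-supervision sketch: one auxiliary classifier per GNN layer, with
# the training loss summed across layers.

class DSGNN(nn.Module):
    def __init__(self, in_dim, hidden, n_classes, n_layers=4):
        super().__init__()
        dims = [in_dim] + [hidden] * n_layers
        self.layers = nn.ModuleList(
            nn.Linear(dims[i], dims[i + 1]) for i in range(n_layers))
        self.heads = nn.ModuleList(
            nn.Linear(hidden, n_classes) for _ in range(n_layers))

    def forward(self, x, adj_norm):
        logits_per_layer = []
        h = x
        for layer, head in zip(self.layers, self.heads):
            h = torch.relu(layer(adj_norm @ h))   # one propagation step
            logits_per_layer.append(head(h))
        return logits_per_layer

def deep_supervision_loss(logits_per_layer, labels):
    # Train on every layer's output, not just the last one.
    return sum(F.cross_entropy(l, labels) for l in logits_per_layer)
```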
- Hybrid Graph Neural Networks for Few-Shot Learning [85.93495480949079]
Graph neural networks (GNNs) have been used to tackle the few-shot learning problem.
Under the inductive setting, existing GNN-based methods are less competitive.
We propose a novel hybrid GNN model consisting of two GNNs, an instance GNN and a prototype GNN.
arXiv Detail & Related papers (2021-12-13T10:20:15Z)
- Graph-less Neural Networks: Teaching Old MLPs New Tricks via Distillation [34.676755383361005]
Graph-less Neural Networks (GLNNs) have no inference graph dependency.
We show that GLNNs with competitive performance infer faster than GNNs by 146X-273X and faster than other acceleration methods by 14X-27X.
A comprehensive analysis of GLNN shows when and why GLNN can achieve results competitive with GNNs, and suggests GLNN as a handy choice for latency-constrained applications.
arXiv Detail & Related papers (2021-10-17T05:16:58Z)
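A sketch of GNN-to-MLP distillation in the spirit of GLNN, not the paper's code: the student MLP is trained to match a pretrained teacher GNN's soft predictions alongside the true labels, then serves without the graph at inference. The temperature T and equal loss weighting are illustrative choices.
```python
import torch.nn.functional as F

# One distillation step: the student MLP sees only node features, while
# the teacher GNN's logits (computed once over the graph) supply soft
# targets.

def distill_step(student_mlp, teacher_logits, x, labels, optimizer, T=2.0):
    student_logits = student_mlp(x)                  # graph-free forward
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)   # ground-truth labels
    loss = soft + hard
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# After distillation the MLP predicts from node features alone, which is
# what removes the inference-time graph dependency.
```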
- Attentive Graph Neural Networks for Few-Shot Learning [74.01069516079379]
Graph Neural Networks (GNNs) have demonstrated superior performance in many challenging applications, including few-shot learning tasks.
Despite their powerful capacity to learn and generalize from few samples, GNNs usually suffer from severe over-fitting and over-smoothing as the model becomes deep.
We propose a novel Attentive GNN to tackle these challenges, by incorporating a triple-attention mechanism.
arXiv Detail & Related papers (2020-07-14T07:43:09Z)