Graph Neural Networks for Learning Equivariant Representations of Neural Networks
- URL: http://arxiv.org/abs/2403.12143v3
- Date: Tue, 23 Jul 2024 16:30:10 GMT
- Title: Graph Neural Networks for Learning Equivariant Representations of Neural Networks
- Authors: Miltiadis Kofinas, Boris Knyazev, Yan Zhang, Yunlu Chen, Gertjan J. Burghouts, Efstratios Gavves, Cees G. M. Snoek, David W. Zhang
- Abstract summary: We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
- Score: 55.04145324152541
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks that process the parameters of other neural networks find applications in domains as diverse as classifying implicit neural representations, generating neural network weights, and predicting generalization errors. However, existing approaches either overlook the inherent permutation symmetry in the neural network or rely on intricate weight-sharing patterns to achieve equivariance, while ignoring the impact of the network architecture itself. In this work, we propose to represent neural networks as computational graphs of parameters, which allows us to harness powerful graph neural networks and transformers that preserve permutation symmetry. Consequently, our approach enables a single model to encode neural computational graphs with diverse architectures. We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations, predicting generalization performance, and learning to optimize, while consistently outperforming state-of-the-art methods. The source code is open-sourced at https://github.com/mkofinas/neural-graphs.
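To make the idea of a "computational graph of parameters" concrete, here is a minimal sketch in plain Python. It is not the authors' implementation (that lives in the repository linked above). It assumes a two-layer MLP, places each bias on a neuron node and each weight on a directed edge, and uses hypothetical helper names (`mlp_to_graph`, `forward`, `permute_hidden`). It then checks that reordering the hidden neurons leaves the network function unchanged and only relabels the nodes of the graph, which is the symmetry a graph neural network respects by construction.

```python
import math
import random


def mlp_to_graph(weights, biases):
    """Turn an MLP into node and edge features (illustrative assumption).

    weights[l] is a fan_out x fan_in nested list, biases[l] has length fan_out.
    Node feature: the neuron's bias (0.0 for input neurons).
    Edge feature: the weight connecting neuron j in layer l to neuron i in layer l + 1.
    """
    layer_sizes = [len(weights[0][0])] + [len(b) for b in biases]
    offsets = [0]
    for size in layer_sizes:
        offsets.append(offsets[-1] + size)
    node_features = [0.0] * layer_sizes[0]
    for b in biases:
        node_features.extend(b)
    edges = {}
    for l, W in enumerate(weights):
        for i, row in enumerate(W):
            for j, w in enumerate(row):
                edges[(offsets[l] + j, offsets[l + 1] + i)] = w
    return node_features, edges


def forward(weights, biases, x):
    """Evaluate the MLP with tanh on all layers except the last."""
    for l, (W, b) in enumerate(zip(weights, biases)):
        x = [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]
        if l < len(weights) - 1:
            x = [math.tanh(v) for v in x]
    return x


def permute_hidden(weights, biases, perm):
    """Reorder the hidden neurons of a two-layer MLP: rows of W1 and b1, columns of W2."""
    (W1, W2), (b1, b2) = weights, biases
    new_W1 = [W1[p] for p in perm]
    new_b1 = [b1[p] for p in perm]
    new_W2 = [[row[p] for p in perm] for row in W2]
    return [new_W1, new_W2], [new_b1, b2]


random.seed(0)
n_in, n_hidden, n_out = 2, 3, 1
weights = [
    [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
    [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)],
]
biases = [
    [random.uniform(-1, 1) for _ in range(n_hidden)],
    [random.uniform(-1, 1) for _ in range(n_out)],
]

perm = [2, 0, 1]
weights_p, biases_p = permute_hidden(weights, biases, perm)

# The permuted network computes the same function.
x = [0.3, -0.7]
out, out_perm = forward(weights, biases, x), forward(weights_p, biases_p, x)
same_function = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(out, out_perm))

# The permuted graph is the original graph with hidden nodes relabeled.
node_features, edges = mlp_to_graph(weights, biases)
node_features_p, edges_p = mlp_to_graph(weights_p, biases_p)
relabel = {n: n for n in range(len(node_features))}
for k, p in enumerate(perm):
    relabel[n_in + k] = n_in + p  # new hidden node k is old hidden node perm[k]
is_relabeling = all(
    node_features_p[n] == node_features[relabel[n]] for n in relabel
) and all(edges_p[(u, v)] == edges[(relabel[u], relabel[v])] for (u, v) in edges_p)

print(same_function, is_relabeling)
```

Because the two parameter sets differ only by a node relabeling, any model that is invariant or equivariant to node order treats them consistently. A flat vector of weights would not, since the same permutation changes the order of its entries.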
Related papers
- Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z) - Graph Metanetworks for Processing Diverse Neural Architectures [33.686728709734105]
Graph Metanetworks (GMNs) generalize to neural architectures where competing methods struggle.
We prove that GMNs are expressive and equivariant to parameter permutation symmetries that leave the input neural network functions unchanged.
arXiv Detail & Related papers (2023-12-07T18:21:52Z) - Generalizable Neural Fields as Partially Observed Neural Processes [16.202109517569145]
We propose a new paradigm that views the large-scale training of neural representations as a part of a partially-observed neural process framework.
We demonstrate that this approach outperforms both state-of-the-art gradient-based meta-learning approaches and hypernetwork approaches.
arXiv Detail & Related papers (2023-09-13T01:22:16Z) - Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks.
We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order.
In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
arXiv Detail & Related papers (2023-02-27T18:52:38Z) - Gaussian Process Surrogate Models for Neural Networks [6.8304779077042515]
In science and engineering, modeling is a methodology used to understand complex systems whose internal processes are opaque.
We construct a class of surrogate models for neural networks using Gaussian processes.
We demonstrate our approach captures existing phenomena related to the spectral bias of neural networks, and then show that our surrogate models can be used to solve practical problems.
arXiv Detail & Related papers (2022-08-11T20:17:02Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - Stochastic Neural Networks with Infinite Width are Deterministic [7.07065078444922]
We study stochastic neural networks, a main type of neural network in use.
We prove that as the width of an optimized neural network tends to infinity, its predictive variance on the training set decreases to zero.
arXiv Detail & Related papers (2022-01-30T04:52:31Z) - E(n) Equivariant Graph Neural Networks [86.75170631724548]
This paper introduces a new model to learn graph neural networks equivariant to rotations, translations, reflections and permutations called E(n)-Equivariant Graph Neural Networks (EGNNs).
In contrast with existing methods, our work does not require computationally expensive higher-order representations in intermediate layers while it still achieves competitive or better performance.
arXiv Detail & Related papers (2021-02-19T10:25:33Z) - Graph Structure of Neural Networks [104.33754950606298]
We show how the graph structure of neural networks affects their predictive performance.
A "sweet spot" of relational graphs leads to neural networks with significantly improved predictive performance.
Top-performing neural networks have graph structures surprisingly similar to those of real biological neural networks.
arXiv Detail & Related papers (2020-07-13T17:59:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.