Graph Metanetworks for Processing Diverse Neural Architectures
- URL: http://arxiv.org/abs/2312.04501v2
- Date: Fri, 29 Dec 2023 22:55:45 GMT
- Title: Graph Metanetworks for Processing Diverse Neural Architectures
- Authors: Derek Lim, Haggai Maron, Marc T. Law, Jonathan Lorraine, James Lucas
- Abstract summary: Graph Metanetworks (GMNs) generalize to neural architectures where competing methods struggle.
We prove that GMNs are expressive and equivariant to parameter permutation symmetries that leave the input neural network functions unchanged.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural networks efficiently encode learned information within their
parameters. Consequently, many tasks can be unified by treating neural networks
themselves as input data. When doing so, recent studies demonstrated the
importance of accounting for the symmetries and geometry of parameter spaces.
However, those works developed architectures tailored to specific networks such
as MLPs and CNNs without normalization layers, and generalizing such
architectures to other types of networks can be challenging. In this work, we
overcome these challenges by building new metanetworks - neural networks that
take weights from other neural networks as input. Put simply, we carefully
build graphs representing the input neural networks and process the graphs
using graph neural networks. Our approach, Graph Metanetworks (GMNs),
generalizes to neural architectures where competing methods struggle, such as
multi-head attention layers, normalization layers, convolutional layers, ResNet
blocks, and group-equivariant linear layers. We prove that GMNs are expressive
and equivariant to parameter permutation symmetries that leave the input neural
network functions unchanged. We validate the effectiveness of our method on
several metanetwork tasks over diverse neural network architectures.
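The core idea of the abstract, turning an input neural network into a graph whose nodes are neurons and whose edges carry the weights, then processing that graph with a GNN, can be illustrated in a few lines. The sketch below is a minimal, hypothetical construction (the names and the mean-aggregation update are illustrative, not the paper's actual parameter graphs or GNN):

```python
import numpy as np

# Encode a small 2-layer MLP as a graph: one node per neuron,
# one directed edge per weight, with the weight as the edge feature.
rng = np.random.default_rng(0)
sizes = [3, 4, 2]  # input, hidden, output widths
weights = [rng.normal(size=(sizes[i + 1], sizes[i])) for i in range(2)]

# Global node ids: layer offsets into a single node index space.
offsets = np.cumsum([0] + sizes)  # [0, 3, 7, 9]
num_nodes = int(offsets[-1])

edges, edge_feats = [], []
for layer, W in enumerate(weights):
    for out_idx in range(W.shape[0]):
        for in_idx in range(W.shape[1]):
            src = offsets[layer] + in_idx       # presynaptic neuron
            dst = offsets[layer + 1] + out_idx  # postsynaptic neuron
            edges.append((src, dst))
            edge_feats.append(W[out_idx, in_idx])

# One naive weight-modulated message-passing step over scalar node states.
h = np.zeros(num_nodes)
h[: sizes[0]] = 1.0  # mark input neurons
msg = np.zeros(num_nodes)
deg = np.zeros(num_nodes)
for (src, dst), w in zip(edges, edge_feats):
    msg[dst] += w * h[src]
    deg[dst] += 1
h_new = np.where(deg > 0, msg / np.maximum(deg, 1), h)

print(num_nodes, len(edges))  # 9 nodes, 3*4 + 4*2 = 20 edges
```

Because the graph construction only reads layer shapes and weight values, the same recipe extends mechanically to the architectures the paper targets (attention heads, normalization layers, residual connections), which is what lets a single GNN process diverse input networks.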
Related papers
- Deep Neural Networks via Complex Network Theory: a Perspective [3.1023851130450684]
Deep Neural Networks (DNNs) can be represented as graphs whose links and vertices iteratively process data and solve tasks sub-optimally. Complex Network Theory (CNT), merging statistical physics with graph theory, provides a method for interpreting neural networks by analysing their weights and neuron structures.
In this work, we extend the existing CNT metrics with measures that sample from the DNNs' training distribution, shifting from a purely topological analysis to one that connects with the interpretability of deep learning.
arXiv Detail & Related papers (2024-04-17T08:42:42Z) - Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - Equivariant Matrix Function Neural Networks [1.8717045355288808]
We introduce Matrix Function Neural Networks (MFNs), a novel architecture that parameterizes non-local interactions through analytic matrix equivariant functions.
MFNs are able to capture intricate non-local interactions in quantum systems, paving the way to new state-of-the-art force fields.
arXiv Detail & Related papers (2023-10-16T14:17:00Z) - Set-based Neural Network Encoding Without Weight Tying [91.37161634310819]
We propose a neural network weight encoding method for network property prediction.
Our approach is capable of encoding neural networks in a model zoo of mixed architectures.
We introduce two new tasks for neural network property prediction: cross-dataset and cross-architecture.
arXiv Detail & Related papers (2023-05-26T04:34:28Z) - Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks.
We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order.
In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
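The hidden-neuron permutation symmetry this line of work is built on is easy to verify concretely: relabeling the hidden units of an MLP, while permuting its weight matrices accordingly, leaves the network's input-output function unchanged. A minimal numpy sketch (illustrative; not code from any of the listed papers):

```python
import numpy as np

rng = np.random.default_rng(1)

# A one-hidden-layer ReLU MLP: 3 inputs, 5 hidden units, 2 outputs.
W1, b1 = rng.normal(size=(5, 3)), rng.normal(size=5)
W2, b2 = rng.normal(size=(2, 5)), rng.normal(size=2)

def mlp(x, W1, b1, W2, b2):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

# Random permutation of the 5 hidden neurons, as a permutation matrix.
P = np.eye(5)[rng.permutation(5)]

x = rng.normal(size=3)
out = mlp(x, W1, b1, W2, b2)
# Permute rows of layer 1 (and its bias), columns of layer 2.
out_perm = mlp(x, P @ W1, P @ b1, W2 @ P.T, b2)

# ReLU acts elementwise, so it commutes with P; the function is unchanged.
print(np.allclose(out, out_perm))  # True
```

A metanetwork that respects this symmetry must map the two weight settings above to the same (or consistently permuted) output, which is exactly the equivariance constraint these architectures impose.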
arXiv Detail & Related papers (2023-02-27T18:52:38Z) - Deep Neural Networks as Complex Networks [1.704936863091649]
We use Complex Network Theory to represent Deep Neural Networks (DNNs) as directed weighted graphs.
We introduce metrics to study DNNs as dynamical systems, with a granularity that spans from weights to layers, including neurons.
We show that our metrics distinguish low-performing from high-performing networks.
arXiv Detail & Related papers (2022-09-12T16:26:04Z) - Dynamic Graph: Learning Instance-aware Connectivity for Neural Networks [78.65792427542672]
Dynamic Graph Network (DG-Net) is a complete directed acyclic graph, where the nodes represent convolutional blocks and the edges represent connection paths.
Instead of using the same path of the network, DG-Net aggregates features dynamically in each node, which gives the network greater representational capacity.
arXiv Detail & Related papers (2020-10-02T16:50:26Z) - Graph Structure of Neural Networks [104.33754950606298]
We show how the graph structure of neural networks affects their predictive performance.
A "sweet spot" of relational graphs leads to neural networks with significantly improved predictive performance.
Top-performing neural networks have graph structure surprisingly similar to those of real biological neural networks.
arXiv Detail & Related papers (2020-07-13T17:59:31Z) - Neural networks adapting to datasets: learning network size and topology [77.34726150561087]
We introduce a flexible setup allowing for a neural network to learn both its size and topology during the course of a gradient-based training.
The resulting network has the structure of a graph tailored to the particular learning task and dataset.
arXiv Detail & Related papers (2020-06-22T12:46:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.