Neural Functional Transformers
- URL: http://arxiv.org/abs/2305.13546v1
- Date: Mon, 22 May 2023 23:38:27 GMT
- Title: Neural Functional Transformers
- Authors: Allan Zhou, Kaien Yang, Yiding Jiang, Kaylee Burns, Winnie Xu, Samuel
Sokota, J. Zico Kolter, Chelsea Finn
- Abstract summary: This paper uses the attention mechanism to define a novel set of permutation equivariant weight-space layers and composes them into deep models called neural functional Transformers (NFTs).
NFTs respect weight-space permutation symmetries while incorporating the advantages of attention, which have exhibited remarkable success across multiple domains.
We also leverage NFTs to develop Inr2Array, a novel method for computing permutation invariant latent representations from the weights of implicit neural representations (INRs).
- Score: 99.98750156515437
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recent success of neural networks as implicit representation of data has
driven growing interest in neural functionals: models that can process other
neural networks as input by operating directly over their weight spaces.
Nevertheless, constructing expressive and efficient neural functional
architectures that can handle high-dimensional weight-space objects remains
challenging. This paper uses the attention mechanism to define a novel set of
permutation equivariant weight-space layers and composes them into deep
equivariant models called neural functional Transformers (NFTs). NFTs respect
weight-space permutation symmetries while incorporating the advantages of
attention, which have exhibited remarkable success across multiple domains. In
experiments processing the weights of feedforward MLPs and CNNs, we find that
NFTs match or exceed the performance of prior weight-space methods. We also
leverage NFTs to develop Inr2Array, a novel method for computing permutation
invariant latent representations from the weights of implicit neural
representations (INRs). Our proposed method improves INR classification
accuracy by up to $+17\%$ over existing methods. We provide an implementation
of our layers at https://github.com/AllanYangZhou/nfn.
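The weight-space permutation symmetry that NFTs are built to respect can be checked directly: permuting the hidden neurons of an MLP, together with the matching rows and columns of the adjacent weight matrices, leaves the network's input-output function unchanged, so a neural functional consuming those weights should be equivariant (or invariant) to such relabelings. Below is a minimal NumPy sketch of this fact; it is illustrative only and not taken from the linked nfn repository.

```python
# Minimal check of the hidden-neuron permutation symmetry of a 2-layer MLP:
# permuting rows of W1/b1 and the matching columns of W2 does not change the
# function the network computes.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 4, 8, 3

W1, b1 = rng.normal(size=(d_hidden, d_in)), rng.normal(size=d_hidden)
W2, b2 = rng.normal(size=(d_out, d_hidden)), rng.normal(size=d_out)

def mlp(x, W1, b1, W2, b2):
    h = np.maximum(W1 @ x + b1, 0.0)  # ReLU hidden layer
    return W2 @ h + b2

perm = rng.permutation(d_hidden)      # random relabeling of hidden neurons
W1_p, b1_p = W1[perm], b1[perm]       # permute rows of W1 and entries of b1
W2_p = W2[:, perm]                    # permute the matching columns of W2

x = rng.normal(size=d_in)
assert np.allclose(mlp(x, W1, b1, W2, b2), mlp(x, W1_p, b1_p, W2_p, b2))
print("Permuted and original weights define the same function.")
```

Self-attention without positional encodings is itself permutation equivariant over its input tokens, which is what makes attention a natural building block for layers that must commute with these neuron relabelings.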
Related papers
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- Universal Neural Functionals [67.80283995795985]
A challenging problem in many modern machine learning tasks is to process weight-space features.
Recent works have developed promising weight-space models that are equivariant to the permutation symmetries of simple feedforward networks.
This work proposes an algorithm that automatically constructs permutation equivariant models for any weight space.
arXiv Detail & Related papers (2024-02-07T20:12:27Z)
- Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks.
We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order.
In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
arXiv Detail & Related papers (2023-02-27T18:52:38Z)
- Smooth Mathematical Function from Compact Neural Networks [0.0]
We obtain NNs that generate highly accurate and highly smooth functions while comprising only a few weight parameters.
A new activation function, a meta-batch method, features of numerical data, and meta-augmentation with meta-parameters are presented.
arXiv Detail & Related papers (2022-12-31T11:33:24Z)
- Variational Tensor Neural Networks for Deep Learning [0.0]
We propose an integration of tensor networks (TNs) into deep neural networks (NNs).
This, in turn, results in a scalable tensor neural network (TNN) architecture capable of efficient training over a large parameter space.
We validate the accuracy and efficiency of our method by designing TNN models and providing benchmark results for linear and non-linear regressions, data classification and image recognition on MNIST handwritten digits.
arXiv Detail & Related papers (2022-11-26T20:24:36Z)
- Revisiting Transformation Invariant Geometric Deep Learning: Are Initial Representations All You Need? [80.86819657126041]
We show that transformation-invariant and distance-preserving initial representations are sufficient to achieve transformation invariance.
Specifically, we realize transformation-invariant and distance-preserving initial point representations by modifying multi-dimensional scaling.
We prove that the resulting TinvNN strictly guarantees transformation invariance and is general and flexible enough to be combined with existing neural networks.
arXiv Detail & Related papers (2021-12-23T03:52:33Z)
- E(n) Equivariant Graph Neural Networks [86.75170631724548]
This paper introduces E(n)-Equivariant Graph Neural Networks (EGNNs), a new model for learning graph neural networks that are equivariant to rotations, translations, reflections, and permutations.
In contrast with existing methods, our work does not require computationally expensive higher-order representations in intermediate layers while still achieving competitive or better performance (an illustrative layer-update sketch follows this list).
arXiv Detail & Related papers (2021-02-19T10:25:33Z)
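For the E(n)-equivariant GNN entry above, the key idea is that messages are computed from rotation- and translation-invariant quantities (node features and pairwise squared distances), while coordinates are updated along relative difference vectors. The sketch below illustrates one such layer update in NumPy; the fully connected graph, sum aggregation, and random linear maps standing in for the small MLPs are simplifying assumptions, not the paper's exact architecture.

```python
# Illustrative EGNN-style layer: E(n)-invariant messages, coordinate updates
# along relative vectors, and an equivariance sanity check.
import numpy as np

rng = np.random.default_rng(0)
n, d_feat, d_coord, d_msg = 5, 8, 3, 16

h = rng.normal(size=(n, d_feat))    # node features (should stay invariant)
x = rng.normal(size=(n, d_coord))   # node coordinates (should transform equivariantly)

W_e = 0.1 * rng.normal(size=(2 * d_feat + 1, d_msg))   # stand-in for the edge MLP
W_x = 0.1 * rng.normal(size=(d_msg, 1))                # stand-in for the coordinate MLP
W_h = 0.1 * rng.normal(size=(d_feat + d_msg, d_feat))  # stand-in for the node MLP

def egnn_layer(h, x):
    n = h.shape[0]
    m_agg = np.zeros((n, d_msg))
    x_new = x.copy()
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            diff = x[i] - x[j]
            dist2 = np.array([diff @ diff])               # invariant to rotation/translation
            m_ij = np.tanh(np.concatenate([h[i], h[j], dist2]) @ W_e)
            m_agg[i] += m_ij
            x_new[i] += diff * (m_ij @ W_x) / (n - 1)     # move along the relative vector
    h_new = np.tanh(np.concatenate([h, m_agg], axis=1) @ W_h)
    return h_new, x_new

h_out, x_out = egnn_layer(h, x)

# Rotating/reflecting and translating the input coordinates transforms the output
# coordinates the same way and leaves the output features unchanged.
Q, _ = np.linalg.qr(rng.normal(size=(d_coord, d_coord)))  # random orthogonal matrix
t = rng.normal(size=d_coord)
h_t, x_t = egnn_layer(h, x @ Q.T + t)
assert np.allclose(h_t, h_out)
assert np.allclose(x_t, x_out @ Q.T + t)
```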
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.