Universal Neural Functionals
- URL: http://arxiv.org/abs/2402.05232v1
- Date: Wed, 7 Feb 2024 20:12:27 GMT
- Title: Universal Neural Functionals
- Authors: Allan Zhou, Chelsea Finn, James Harrison
- Abstract summary: A challenging problem in many modern machine learning tasks is to process weight-space features.
Recent works have developed promising weight-space models that are equivariant to the permutation symmetries of simple feedforward networks.
This work proposes an algorithm that automatically constructs permutation equivariant models for any weight space.
- Score: 67.80283995795985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A challenging problem in many modern machine learning tasks is to process
weight-space features, i.e., to transform or extract information from the
weights and gradients of a neural network. Recent works have developed
promising weight-space models that are equivariant to the permutation
symmetries of simple feedforward networks. However, they are not applicable to
general architectures, since the permutation symmetries of a weight space can
be complicated by recurrence or residual connections. This work proposes an
algorithm that automatically constructs permutation equivariant models, which
we refer to as universal neural functionals (UNFs), for any weight space. Among
other applications, we demonstrate how UNFs can be substituted into existing
learned optimizer designs, and find promising improvements over prior methods
when optimizing small image classifiers and language models. Our results
suggest that learned optimizers can benefit from considering the (symmetry)
structure of the weight space they optimize. We open-source our library for
constructing UNFs at
https://github.com/AllanYangZhou/universal_neural_functional.
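To make the symmetry the abstract refers to concrete, the sketch below (plain NumPy, not the API of the linked UNF library) checks that permuting the hidden units of a two-layer MLP leaves its input-output function unchanged, and shows a toy weight-space update that commutes with such permutations; `mlp` and `toy_equivariant_update` are illustrative names, not functions from the UNF codebase.
```python
# Minimal sketch (plain NumPy, not the UNF library's API) of the hidden-unit
# permutation symmetry that weight-space models are built around.
import numpy as np

rng = np.random.default_rng(0)

# A 2-layer MLP: x -> W2 @ relu(W1 @ x + b1) + b2
d_in, d_hid, d_out = 4, 8, 3
W1, b1 = rng.normal(size=(d_hid, d_in)), rng.normal(size=d_hid)
W2, b2 = rng.normal(size=(d_out, d_hid)), rng.normal(size=d_out)

def mlp(x, W1, b1, W2, b2):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

# Relabeling the hidden neurons (permute rows of W1 and b1, columns of W2)
# does not change the network's function.
perm = rng.permutation(d_hid)
x = rng.normal(size=d_in)
assert np.allclose(mlp(x, W1, b1, W2, b2),
                   mlp(x, W1[perm], b1[perm], W2[:, perm], b2))

def toy_equivariant_update(W1, b1, W2):
    """A toy permutation-equivariant weight-space map (not a UNF layer):
    every entry is combined only with a pooled statistic of its hidden unit,
    so relabeling hidden units commutes with the update."""
    h = W1.sum(axis=1) + b1 + W2.sum(axis=0)  # one pooled statistic per hidden unit
    return W1 + h[:, None], b1 + h, W2 + h[None, :]

# Equivariance check: update-then-permute equals permute-then-update.
U1, u1, U2 = toy_equivariant_update(W1, b1, W2)
V1, v1, V2 = toy_equivariant_update(W1[perm], b1[perm], W2[:, perm])
assert np.allclose(U1[perm], V1)
assert np.allclose(u1[perm], v1)
assert np.allclose(U2[:, perm], V2)
```
Per the abstract, the UNF construction generalizes this idea beyond simple feedforward networks, automatically deriving equivariant layers for weight spaces whose symmetries are complicated by recurrence or residual connections.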
Related papers
- The Persian Rug: solving toy models of superposition using large-scale symmetries [0.0]
We present a complete mechanistic description of the algorithm learned by a minimal non-linear sparse data autoencoder in the limit of large input dimension.
Our work contributes to neural network interpretability by introducing techniques for understanding the structure of autoencoders.
arXiv Detail & Related papers (2024-10-15T22:52:45Z) - Neural approximation of Wasserstein distance via a universal architecture for symmetric and factorwise group invariant functions [6.994580267603235]
We first present a general neural network architecture for approximating SFGI functions.
The main contribution of this paper combines this general neural network with a sketching idea to develop a specific and efficient neural network.
Our work provides an interesting integration of sketching ideas for geometric problems with universal approximation of symmetric functions.
arXiv Detail & Related papers (2023-08-01T04:11:19Z) - Neural Functional Transformers [99.98750156515437]
This paper uses the attention mechanism to define a novel set of permutation equivariant weight-space layers called neural functional Transformers (NFTs).
NFTs respect weight-space permutation symmetries while incorporating the advantages of attention, which have exhibited remarkable success across multiple domains.
We also leverage NFTs to develop Inr2Array, a novel method for computing permutation invariant representations from the weights of implicit neural representations (INRs).
arXiv Detail & Related papers (2023-05-22T23:38:27Z) - Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks.
We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order.
In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
arXiv Detail & Related papers (2023-02-27T18:52:38Z) - Equivariant Architectures for Learning in Deep Weight Spaces [54.61765488960555]
We present a novel network architecture for learning in deep weight spaces.
It takes as input a concatenation of weights and biases of a pre-trained MLP.
We show how these layers can be implemented using three basic operations.
arXiv Detail & Related papers (2023-01-30T10:50:33Z) - Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks.
We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems.
We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
arXiv Detail & Related papers (2022-02-08T16:16:11Z) - Frame Averaging for Invariant and Equivariant Network Design [50.87023773850824]
We introduce Frame Averaging (FA), a framework for adapting known (backbone) architectures to become invariant or equivariant to new symmetry types.
We show that FA-based models have maximal expressive power in a broad setting.
We propose a new class of universal Graph Neural Networks (GNNs), universal Euclidean motion invariant point cloud networks, and Euclidean motion invariant Message Passing (MP) GNNs.
arXiv Detail & Related papers (2021-10-07T11:05:23Z)
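For intuition on the frame-averaging entry above, the sketch below (plain NumPy; `backbone` and `frame_average` are illustrative names, not the FA paper's code) instantiates the idea in the simplest setting, using the full permutation group S_3 as a trivial frame so that averaging an arbitrary backbone over it yields an invariant model; the paper's contribution lies in constructing much smaller, input-dependent frames for large or continuous symmetry groups.
```python
# Minimal sketch of frame averaging (FA) for invariance, using the full
# permutation group S_3 as a trivial frame.
import itertools
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(2, 3))  # an arbitrary, non-invariant backbone: x -> tanh(W @ x)

def backbone(x):
    return np.tanh(W @ x)

def frame_average(f, x):
    # Average the backbone's outputs over every coordinate permutation of the input.
    perms = itertools.permutations(range(len(x)))
    return np.mean([f(x[list(p)]) for p in perms], axis=0)

x = rng.normal(size=3)
shuffled = x[rng.permutation(3)]
# The averaged model is invariant: permuting the input leaves the output unchanged.
assert np.allclose(frame_average(backbone, x), frame_average(backbone, shuffled))
```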
This list is automatically generated from the titles and abstracts of the papers on this site.