Universal Neural Functionals
- URL: http://arxiv.org/abs/2402.05232v1
- Date: Wed, 7 Feb 2024 20:12:27 GMT
- Title: Universal Neural Functionals
- Authors: Allan Zhou, Chelsea Finn, James Harrison
- Abstract summary: A challenging problem in many modern machine learning tasks is to process weight-space features.
Recent works have developed promising weight-space models that are equivariant to the permutation symmetries of simple feedforward networks.
This work proposes an algorithm that automatically constructs permutation equivariant models for any weight space.
- Score: 67.80283995795985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A challenging problem in many modern machine learning tasks is to process
weight-space features, i.e., to transform or extract information from the
weights and gradients of a neural network. Recent works have developed
promising weight-space models that are equivariant to the permutation
symmetries of simple feedforward networks. However, they are not applicable to
general architectures, since the permutation symmetries of a weight space can
be complicated by recurrence or residual connections. This work proposes an
algorithm that automatically constructs permutation equivariant models, which
we refer to as universal neural functionals (UNFs), for any weight space. Among
other applications, we demonstrate how UNFs can be substituted into existing
learned optimizer designs, and find promising improvements over prior methods
when optimizing small image classifiers and language models. Our results
suggest that learned optimizers can benefit from considering the (symmetry)
structure of the weight space they optimize. We open-source our library for
constructing UNFs at
https://github.com/AllanYangZhou/universal_neural_functional.
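To make the symmetry the abstract refers to concrete, the sketch below (plain NumPy, not the API of the linked UNF library) checks that permuting the hidden units of a two-layer MLP leaves its input-output function unchanged, and shows a toy weight-space update that commutes with such permutations; `mlp` and `toy_equivariant_update` are illustrative names, not functions from the UNF codebase.
```python
# Minimal sketch (plain NumPy, not the UNF library's API) of the hidden-unit
# permutation symmetry that weight-space models are built around.
import numpy as np

rng = np.random.default_rng(0)

# A 2-layer MLP: x -> W2 @ relu(W1 @ x + b1) + b2
d_in, d_hid, d_out = 4, 8, 3
W1, b1 = rng.normal(size=(d_hid, d_in)), rng.normal(size=d_hid)
W2, b2 = rng.normal(size=(d_out, d_hid)), rng.normal(size=d_out)

def mlp(x, W1, b1, W2, b2):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

# Relabeling the hidden neurons (permute rows of W1 and b1, columns of W2)
# does not change the network's function.
perm = rng.permutation(d_hid)
x = rng.normal(size=d_in)
assert np.allclose(mlp(x, W1, b1, W2, b2),
                   mlp(x, W1[perm], b1[perm], W2[:, perm], b2))

def toy_equivariant_update(W1, b1, W2):
    """A toy permutation-equivariant weight-space map (not a UNF layer):
    every entry is combined only with a pooled statistic of its hidden unit,
    so relabeling hidden units commutes with the update."""
    h = W1.sum(axis=1) + b1 + W2.sum(axis=0)  # one pooled statistic per hidden unit
    return W1 + h[:, None], b1 + h, W2 + h[None, :]

# Equivariance check: update-then-permute equals permute-then-update.
U1, u1, U2 = toy_equivariant_update(W1, b1, W2)
V1, v1, V2 = toy_equivariant_update(W1[perm], b1[perm], W2[:, perm])
assert np.allclose(U1[perm], V1)
assert np.allclose(u1[perm], v1)
assert np.allclose(U2[:, perm], V2)
```
Per the abstract, the UNF construction generalizes this idea beyond simple feedforward networks, automatically deriving equivariant layers for weight spaces whose symmetries are complicated by recurrence or residual connections.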
Related papers
- The Persian Rug: solving toy models of superposition using large-scale symmetries [0.0]
We present a complete mechanistic description of the algorithm learned by a minimal non-linear sparse data autoencoder in the limit of large input dimension.
Our work contributes to neural network interpretability by introducing techniques for understanding the structure of autoencoders.
arXiv Detail & Related papers (2024-10-15T22:52:45Z) - Neural approximation of Wasserstein distance via a universal architecture for symmetric and factorwise group invariant functions [6.994580267603235]
We first present a general neural network architecture for approximating SFGI functions.
The main contribution of this paper combines this general neural network with a sketching idea to develop a specific and efficient neural network.
Our work provides an interesting integration of sketching ideas for geometric problems with universal approximation of symmetric functions.
arXiv Detail & Related papers (2023-08-01T04:11:19Z) - Neural Functional Transformers [99.98750156515437]
This paper uses the attention mechanism to define a novel set of permutation equivariant weight-space layers called neural functional Transformers (NFTs).
NFTs respect weight-space permutation symmetries while incorporating the advantages of attention, which have exhibited remarkable success across multiple domains.
We also leverage NFTs to develop Inr2Array, a novel method for computing permutation invariant representations from the weights of implicit neural representations (INRs).
arXiv Detail & Related papers (2023-05-22T23:38:27Z) - Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks.
We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order.
In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
arXiv Detail & Related papers (2023-02-27T18:52:38Z) - Equivariant Architectures for Learning in Deep Weight Spaces [54.61765488960555]
We present a novel network architecture for learning in deep weight spaces.
It takes as input a concatenation of weights and biases of a pre-trained MLP.
We show how these layers can be implemented using three basic operations.
arXiv Detail & Related papers (2023-01-30T10:50:33Z) - Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks.
We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems.
We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
arXiv Detail & Related papers (2022-02-08T16:16:11Z) - Frame Averaging for Invariant and Equivariant Network Design [50.87023773850824]
We introduce Frame Averaging (FA), a framework for adapting known (backbone) architectures to become invariant or equivariant to new symmetry types.
We show that FA-based models have maximal expressive power in a broad setting.
We propose a new class of universal Graph Neural Networks (GNNs), universal Euclidean motion invariant point cloud networks, and Euclidean motion invariant Message Passing (MP) GNNs.
arXiv Detail & Related papers (2021-10-07T11:05:23Z)
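For intuition on the frame-averaging entry above, the sketch below (plain NumPy; `backbone` and `frame_average` are illustrative names, not the FA paper's code) instantiates the idea in the simplest setting, using the full permutation group S_3 as a trivial frame so that averaging an arbitrary backbone over it yields an invariant model; the paper's contribution lies in constructing much smaller, input-dependent frames for large or continuous symmetry groups.
```python
# Minimal sketch of frame averaging (FA) for invariance, using the full
# permutation group S_3 as a trivial frame.
import itertools
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(2, 3))  # an arbitrary, non-invariant backbone: x -> tanh(W @ x)

def backbone(x):
    return np.tanh(W @ x)

def frame_average(f, x):
    # Average the backbone's outputs over every coordinate permutation of the input.
    perms = itertools.permutations(range(len(x)))
    return np.mean([f(x[list(p)]) for p in perms], axis=0)

x = rng.normal(size=3)
shuffled = x[rng.permutation(3)]
# The averaged model is invariant: permuting the input leaves the output unchanged.
assert np.allclose(frame_average(backbone, x), frame_average(backbone, shuffled))
```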
This list is automatically generated from the titles and abstracts of the papers on this site.