Equivariant Architectures for Learning in Deep Weight Spaces
- URL: http://arxiv.org/abs/2301.12780v2
- Date: Wed, 31 May 2023 19:24:08 GMT
- Title: Equivariant Architectures for Learning in Deep Weight Spaces
- Authors: Aviv Navon, Aviv Shamsian, Idan Achituve, Ethan Fetaya, Gal Chechik,
Haggai Maron
- Abstract summary: We present a novel network architecture for learning in deep weight spaces.
It takes as input a concatenation of the weights and biases of a pre-trained MLP.
We show how these layers can be implemented using three basic operations.
- Score: 54.61765488960555
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Designing machine learning architectures for processing neural networks in
their raw weight matrix form is a newly introduced research direction.
Unfortunately, the unique symmetry structure of deep weight spaces makes this
design very challenging. If successful, such architectures would be capable of
performing a wide range of intriguing tasks, from adapting a pre-trained
network to a new domain to editing objects represented as functions (INRs or
NeRFs). As a first step towards this goal, we present here a novel network
architecture for learning in deep weight spaces. It takes as input a
concatenation of weights and biases of a pre-trained MLP and processes it using
a composition of layers that are equivariant to the natural permutation
symmetry of the MLP's weights: Changing the order of neurons in intermediate
layers of the MLP does not affect the function it represents. We provide a full
characterization of all affine equivariant and invariant layers for these
symmetries and show how these layers can be implemented using three basic
operations: pooling, broadcasting, and fully connected layers applied to the
input in an appropriate manner. We demonstrate the effectiveness of our
architecture and its advantages over natural baselines in a variety of learning
tasks.
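To make the symmetry and the layer construction concrete, the sketch below first checks numerically that permuting the hidden neurons of a small MLP leaves its output unchanged, and then builds one permutation-equivariant map over that MLP's weight space out of the three operations named in the abstract: pooling, broadcasting, and fully connected layers. This is a minimal sketch under assumed shapes and names (a single hidden layer, a DeepSets-style per-neuron update; identifiers such as NeuronEquivariantLayer are illustrative), not the paper's full characterization of equivariant layers or its released code.

```python
# Minimal sketch, not the paper's implementation. The weight space of a
# one-hidden-layer MLP  f(x) = W2 @ relu(W1 @ x + b1) + b2  carries a
# permutation symmetry over its h hidden neurons, acting as
# W1 -> P W1, b1 -> P b1, W2 -> W2 P^T.
import torch
import torch.nn as nn


def mlp(x, W1, b1, W2, b2):
    return W2 @ torch.relu(W1 @ x + b1) + b2


d_in, h, d_out = 3, 8, 2
W1, b1 = torch.randn(h, d_in), torch.randn(h)
W2, b2 = torch.randn(d_out, h), torch.randn(d_out)
x = torch.randn(d_in)
perm = torch.randperm(h)

# Symmetry check: reordering hidden neurons does not change the function.
assert torch.allclose(mlp(x, W1, b1, W2, b2),
                      mlp(x, W1[perm], b1[perm], W2[:, perm], b2), atol=1e-5)


class NeuronEquivariantLayer(nn.Module):
    """One equivariant map built from fully connected layers (applied per
    neuron), pooling (mean over neurons), and broadcasting (adding the pooled
    term back to every neuron). Only a simple special case of the affine
    equivariant layers characterized in the paper."""

    def __init__(self, d_in, d_out):
        super().__init__()
        feat = d_in + 1 + d_out                 # [W1 row, b1 entry, W2 column]
        self.local = nn.Linear(feat, feat)      # fully connected, per neuron
        self.from_pool = nn.Linear(feat, feat)  # fully connected, on pooled features
        self.d_in = d_in

    def forward(self, W1, b1, W2):
        tokens = torch.cat([W1, b1.unsqueeze(1), W2.t()], dim=1)   # (h, feat)
        pooled = tokens.mean(dim=0, keepdim=True)                  # pooling
        out = self.local(tokens) + self.from_pool(pooled)          # broadcasting
        return out[:, :self.d_in], out[:, self.d_in], out[:, self.d_in + 1:].t()


# Equivariance check: permuting neurons before the layer equals permuting after.
layer = NeuronEquivariantLayer(d_in, d_out)
W1o, b1o, W2o = layer(W1, b1, W2)
W1p, b1p, W2p = layer(W1[perm], b1[perm], W2[:, perm])
assert torch.allclose(W1o[perm], W1p, atol=1e-5)
assert torch.allclose(b1o[perm], b1p, atol=1e-5)
assert torch.allclose(W2o[:, perm], W2p, atol=1e-5)
```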
Related papers
- EKAN: Equivariant Kolmogorov-Arnold Networks [69.30866522377694]
Kolmogorov-Arnold Networks (KANs) have seen great success in scientific domains.
However, spline functions may not respect the symmetries present in a task, which are crucial prior knowledge in machine learning.
We propose Equivariant Kolmogorov-Arnold Networks (EKAN) to broaden their applicability to more fields.
arXiv Detail & Related papers (2024-10-01T06:34:58Z)
- Universal Neural Functionals [67.80283995795985]
A challenging problem in many modern machine learning tasks is to process weight-space features.
Recent works have developed promising weight-space models that are equivariant to the permutation symmetries of simple feedforward networks.
This work proposes an algorithm that automatically constructs permutation equivariant models for any weight space.
arXiv Detail & Related papers (2024-02-07T20:12:27Z)
- Neural Functional Transformers [99.98750156515437]
This paper uses the attention mechanism to define a novel set of permutation equivariant weight-space layers called neural functional Transformers (NFTs).
NFTs respect weight-space permutation symmetries while incorporating the advantages of attention, which have exhibited remarkable success across multiple domains.
We also leverage NFTs to develop Inr2Array, a novel method for computing permutation invariant representations from the weights of implicit neural representations (INRs). (A minimal sketch of the set-attention idea behind such layers appears after this list.)
arXiv Detail & Related papers (2023-05-22T23:38:27Z)
- Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks.
We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order.
In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
arXiv Detail & Related papers (2023-02-27T18:52:38Z)
- Engineering flexible machine learning systems by traversing functionally-invariant paths [1.4999444543328289]
We introduce a differential geometry framework that provides flexible and continuous adaptation of neural networks.
We formalize adaptation as movement along a geodesic path in weight space while searching for networks that accommodate secondary objectives.
With modest computational resources, the functionally invariant path (FIP) algorithm achieves performance comparable to the state of the art on continual learning and sparsification tasks.
arXiv Detail & Related papers (2022-04-30T19:44:56Z)
- SPINE: Soft Piecewise Interpretable Neural Equations [0.0]
Fully connected networks are ubiquitous but uninterpretable.
This paper takes a novel approach to piecewise fits by using set operations on individual pieces (parts).
It has a variety of potential applications where fully connected layers must be replaced by interpretable layers.
arXiv Detail & Related papers (2021-11-20T16:18:00Z)
- Neural Subdivision [58.97214948753937]
This paper introduces Neural Subdivision, a novel framework for data-driven coarse-to-fine geometry modeling.
We optimize for the same set of network weights across all local mesh patches, thus providing an architecture that is not constrained to a specific input mesh, fixed genus, or category.
We demonstrate that even when trained on a single high-resolution mesh our method generates reasonable subdivisions for novel shapes.
arXiv Detail & Related papers (2020-05-04T20:03:21Z)
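As a rough companion to the Neural Functional Transformers entry above: self-attention without positional encodings is permutation equivariant over its input tokens, which is the property such attention-based weight-space layers rely on. The snippet below only illustrates that general property on assumed per-neuron feature vectors; the sizes and names are hypothetical and it is not the NFT architecture itself.

```python
# Hedged illustration only: treat each hidden neuron's (assumed) feature
# vector as one token and apply self-attention with no positional encoding.
# Permuting the input tokens permutes the outputs in the same way.
import torch
import torch.nn as nn

h, feat = 16, 8                              # neurons, per-neuron features (assumed)
tokens = torch.randn(1, h, feat)             # (batch, tokens, features)
attn = nn.MultiheadAttention(embed_dim=feat, num_heads=2, batch_first=True)

perm = torch.randperm(h)
out, _ = attn(tokens, tokens, tokens)
out_perm, _ = attn(tokens[:, perm], tokens[:, perm], tokens[:, perm])
assert torch.allclose(out[:, perm], out_perm, atol=1e-5)
```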
This list is automatically generated from the titles and abstracts of the papers on this site.