Set-based Neural Network Encoding
- URL: http://arxiv.org/abs/2305.16625v1
- Date: Fri, 26 May 2023 04:34:28 GMT
- Title: Set-based Neural Network Encoding
- Authors: Bruno Andreis, Soro Bedionita, Sung Ju Hwang
- Abstract summary: We propose an approach to neural network weight encoding for generalization performance prediction.
Our approach is capable of encoding neural networks in a modelzoo of mixed architectures.
We introduce two new tasks for neural network generalization performance prediction: cross-dataset and cross-architecture.
- Score: 57.15855198512551
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose an approach to neural network weight encoding for generalization
performance prediction that utilizes set-to-set and set-to-vector functions to
efficiently encode neural network parameters. Our approach is capable of
encoding neural networks in a modelzoo of mixed architectures and different
parameter sizes, as opposed to previous approaches that require custom encoding
models for different architectures. Furthermore, our Set-based Neural network
Encoder (SNE) takes into consideration the hierarchical computational structure
of neural networks by utilizing a layer-wise encoding scheme that culminates in
encoding all layer-wise encodings to obtain the neural network encoding vector.
Additionally, we introduce a pad-chunk-encode pipeline to efficiently encode
neural network layers that is adjustable to computational and memory constraints.
We also introduce two new tasks for neural network generalization performance
prediction: cross-dataset and cross-architecture. In cross-dataset performance
prediction, we evaluate how well performance predictors generalize across
modelzoos trained on different datasets but of the same architecture. In
cross-architecture performance prediction, we evaluate how well generalization
performance predictors transfer to modelzoos of different architectures.
Experimentally, we show that SNE outperforms the relevant baselines on the
cross-dataset task and provide the first set of results on the
cross-architecture task.
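
A minimal sketch of the pad-chunk-encode idea described in the abstract, not the authors' implementation: each layer's flattened weights are zero-padded to a multiple of a chunk size, split into fixed-size chunks, encoded with a permutation-invariant set encoder (a simple DeepSets-style stand-in for the paper's set-to-set/set-to-vector functions), and the per-layer codes are pooled into a single network-level encoding vector. Names such as `chunk_size`, `SetToVector`, and `encode_network` are illustrative assumptions.

```python
import torch
import torch.nn as nn


class SetToVector(nn.Module):
    """Permutation-invariant encoder: embed each set element, then mean-pool."""

    def __init__(self, in_dim: int, hidden_dim: int, out_dim: int):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU(),
                                 nn.Linear(hidden_dim, hidden_dim))
        self.rho = nn.Linear(hidden_dim, out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_elements, in_dim) -> (out_dim,)
        return self.rho(self.phi(x).mean(dim=0))


def pad_chunk(flat: torch.Tensor, chunk_size: int) -> torch.Tensor:
    """Zero-pad a flattened weight tensor and reshape it into fixed-size chunks."""
    pad = (-flat.numel()) % chunk_size
    flat = torch.cat([flat, flat.new_zeros(pad)])
    return flat.view(-1, chunk_size)  # (num_chunks, chunk_size)


def encode_network(model: nn.Module, layer_enc: SetToVector,
                   net_enc: SetToVector, chunk_size: int = 64) -> torch.Tensor:
    """Encode each parameter tensor's chunk set, then pool the per-layer codes."""
    layer_codes = []
    # For simplicity each parameter tensor (weight or bias) is treated as one "layer".
    for param in model.parameters():
        chunks = pad_chunk(param.detach().flatten(), chunk_size)
        layer_codes.append(layer_enc(chunks))
    return net_enc(torch.stack(layer_codes))  # one encoding vector per network


if __name__ == "__main__":
    layer_enc = SetToVector(in_dim=64, hidden_dim=128, out_dim=32)
    net_enc = SetToVector(in_dim=32, hidden_dim=64, out_dim=16)
    # Networks of different architectures map to encodings of the same size.
    small = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 2))
    large = nn.Sequential(nn.Linear(10, 50), nn.ReLU(), nn.Linear(50, 2))
    print(encode_network(small, layer_enc, net_enc).shape)  # torch.Size([16])
    print(encode_network(large, layer_enc, net_enc).shape)  # torch.Size([16])
```

Because the chunk and layer encoders are permutation-invariant and fixed-size, the same encoder applies to modelzoos of mixed architectures and parameter counts, which is the property the abstract emphasizes.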
Related papers
- SWAT-NN: Simultaneous Weights and Architecture Training for Neural Networks in a Latent Space [6.2241272327831485]
We propose a framework that simultaneously optimizes both the architecture and the weights of a neural network. Our framework first trains a universal multi-scale autoencoder that embeds both architectural and parametric information into a continuous latent space. Given a dataset, we then randomly initialize a point in the embedding space and update it via gradient descent to obtain the optimal neural network.
arXiv Detail & Related papers (2025-06-09T22:22:37Z) - Simultaneous Weight and Architecture Optimization for Neural Networks [6.2241272327831485]
We introduce a novel neural network training framework that transforms the training process by learning the architecture and the parameters simultaneously with gradient descent.
Central to our approach is a multi-scale encoder-decoder, in which the encoder embeds pairs of neural networks with similar functionalities close to each other in the latent space.
Experiments demonstrate that our framework can discover sparse and compact neural networks while maintaining high performance.
arXiv Detail & Related papers (2024-10-10T19:57:36Z) - Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - Graph Metanetworks for Processing Diverse Neural Architectures [33.686728709734105]
Graph Metanetworks (GMNs) generalize to neural architectures where competing methods struggle.
We prove that GMNs are expressive and equivariant to parameter permutation symmetries that leave the input neural network's function unchanged.
arXiv Detail & Related papers (2023-12-07T18:21:52Z) - Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks.
We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order.
In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks (a minimal sketch of this weight-space symmetry appears after this list).
arXiv Detail & Related papers (2023-02-27T18:52:38Z) - NAR-Former: Neural Architecture Representation Learning towards Holistic
Attributes Prediction [37.357949900603295]
We propose a neural architecture representation model that can be used to estimate attributes holistically.
Experiment results show that our proposed framework can be used to predict the latency and accuracy attributes of both cell architectures and whole deep neural networks.
arXiv Detail & Related papers (2022-11-15T10:15:21Z) - Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z) - Differentiable Neural Architecture Learning for Efficient Neural Network
Design [31.23038136038325]
We introduce a novel architecture parameterisation based on a scaled sigmoid function.
We then propose a general Differentiable Neural Architecture Learning (DNAL) method to optimize the neural architecture without the need to evaluate candidate neural networks.
arXiv Detail & Related papers (2021-03-03T02:03:08Z) - Modeling from Features: a Mean-field Framework for Over-parameterized
Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z) - Inferring Convolutional Neural Networks' accuracies from their
architectural characterizations [0.0]
We study the relationships between a CNN's architecture and its performance.
We show that the attributes can be predictive of the networks' performance in two specific computer vision-based physics problems.
We use machine learning models to predict whether a network can perform better than a certain threshold accuracy before training.
arXiv Detail & Related papers (2020-01-07T16:41:58Z)