Set-based Neural Network Encoding
- URL: http://arxiv.org/abs/2305.16625v1
- Date: Fri, 26 May 2023 04:34:28 GMT
- Title: Set-based Neural Network Encoding
- Authors: Bruno Andreis, Soro Bedionita, Sung Ju Hwang
- Abstract summary: We propose an approach to neural network weight encoding for generalization performance prediction.
Our approach is capable of encoding neural networks in a modelzoo of mixed architecture.
We introduce two new tasks for neural network generalization performance prediction: cross-dataset and cross-architecture.
- Score: 57.15855198512551
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose an approach to neural network weight encoding for generalization
performance prediction that utilizes set-to-set and set-to-vector functions to
efficiently encode neural network parameters. Our approach is capable of
encoding neural networks in a modelzoo of mixed architecture and different
parameter sizes as opposed to previous approaches that require custom encoding
models for different architectures. Furthermore, our Set-based
Neural network Encoder (SNE) takes into consideration the
hierarchical computational structure of neural networks by utilizing a
layer-wise encoding scheme that culminates in encoding all layer-wise encodings
to obtain the neural network encoding vector. Additionally, we introduce a
pad-chunk-encode pipeline to efficiently encode neural network layers
that is adjustable to computational and memory constraints. We also introduce
two new tasks for neural network generalization performance prediction:
cross-dataset and cross-architecture. In cross-dataset performance prediction,
we evaluate how well performance predictors generalize across modelzoos trained
on different datasets but of the same architecture. In cross-architecture
performance prediction, we evaluate how well generalization performance
predictors transfer to modelzoos of different architecture. Experimentally, we
show that SNE outperforms the relevant baselines on the cross-dataset task and
provide the first set of results on the cross-architecture task.
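As a rough illustration of the pad-chunk-encode idea and the layer-wise hierarchy described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes each layer is given as a flat list of weights, and it replaces the learned set-to-set and set-to-vector functions with plain element-wise mean pooling. The names `pad_and_chunk`, `pool`, `encode_layer`, and `encode_network` are hypothetical, and `chunk_size` stands in for the knob that trades memory against granularity.

```python
from statistics import fmean
from typing import List, Sequence


def pad_and_chunk(values: Sequence[float], chunk_size: int,
                  pad_value: float = 0.0) -> List[List[float]]:
    """Pad a flattened layer to a multiple of chunk_size, then split it into chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not values:
        raise ValueError("a layer must contain at least one weight")
    padded = list(values)
    remainder = len(padded) % chunk_size
    if remainder:
        padded.extend([pad_value] * (chunk_size - remainder))
    return [padded[i:i + chunk_size] for i in range(0, len(padded), chunk_size)]


def pool(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Stand-in for a learned set-to-vector function: element-wise mean.

    The result does not depend on the order of the input vectors.
    """
    return [fmean(column) for column in zip(*vectors)]


def encode_layer(weights: Sequence[float], chunk_size: int) -> List[float]:
    """Pad, chunk, and pool one layer into a vector of length chunk_size."""
    return pool(pad_and_chunk(weights, chunk_size))


def encode_network(layers: Sequence[Sequence[float]], chunk_size: int) -> List[float]:
    """Encode every layer, then pool the layer encodings into one network vector."""
    return pool([encode_layer(weights, chunk_size) for weights in layers])


# Two layers of different sizes map to encodings of the same length.
layers = [[1.0, 2.0, 3.0], [4.0, 4.0, 4.0, 4.0, 0.0, 0.0, 0.0, 0.0]]
print(encode_network(layers, chunk_size=4))
```

Because every layer is reduced to a vector of length `chunk_size` regardless of its parameter count, networks with different architectures end up with encodings of the same size, which is the property the abstract highlights. Mean pooling discards chunk and layer order, so a faithful encoder would need learned components that this sketch omits.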
Related papers
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- Principled Architecture-aware Scaling of Hyperparameters [69.98414153320894]
Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process.
In this work, we precisely characterize the dependence of initializations and maximal learning rates on the network architecture.
We demonstrate that network rankings can be easily changed by better training networks in benchmarks.
arXiv Detail & Related papers (2024-02-27T11:52:49Z)
- GENNAPE: Towards Generalized Neural Architecture Performance Estimators [25.877126553261434]
GENNAPE represents a given neural network as a Computation Graph (CG) of atomic operations.
It first learns a graph encoder via Contrastive Learning to encourage network separation by topological features.
Experiments show that GENNAPE pretrained on NAS-Bench-101 can achieve superior transferability to 5 different public neural network benchmarks.
arXiv Detail & Related papers (2022-11-30T18:27:41Z)
- NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction [37.357949900603295]
We propose a neural architecture representation model that can be used to estimate attributes holistically.
Experiment results show that our proposed framework can be used to predict the latency and accuracy attributes of both cell architectures and whole deep neural networks.
arXiv Detail & Related papers (2022-11-15T10:15:21Z)
- Differentiable Neural Architecture Learning for Efficient Neural Network Design [31.23038136038325]
We introduce a novel architecture parameterisation based on the scaled sigmoid function.
We then propose a general Differentiable Neural Architecture Learning (DNAL) method to optimize the neural architecture without the need to evaluate candidate neural networks.
arXiv Detail & Related papers (2021-03-03T02:03:08Z)
- Self-supervised Representation Learning for Evolutionary Neural Architecture Search [9.038625856798227]
Recently proposed neural architecture search (NAS) algorithms adopt neural predictors to accelerate the architecture search.
How to obtain a neural predictor with high prediction accuracy using a small amount of training data is a central problem to neural predictor-based NAS.
We devise two self-supervised learning methods to pre-train the architecture embedding part of neural predictors.
We achieve state-of-the-art performance on the NASBench-101 and NASBench201 benchmarks when integrating the pre-trained neural predictors with an evolutionary NAS algorithm.
arXiv Detail & Related papers (2020-10-31T04:57:16Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
- A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z)
- Binarizing MobileNet via Evolution-based Searching [66.94247681870125]
We propose the use of evolutionary search to facilitate the construction and training scheme when binarizing MobileNet.
Inspired by one-shot architecture search frameworks, we manipulate the idea of group convolution to design efficient 1-Bit Convolutional Neural Networks (CNNs).
Our objective is to come up with a tiny yet efficient binary neural architecture by exploring the best candidates of the group convolution.
arXiv Detail & Related papers (2020-05-13T13:25:51Z)