Inferring Convolutional Neural Networks' accuracies from their
architectural characterizations
- URL: http://arxiv.org/abs/2001.02160v2
- Date: Fri, 10 Jan 2020 03:52:48 GMT
- Title: Inferring Convolutional Neural Networks' accuracies from their
architectural characterizations
- Authors: Duc Hoang (1), Jesse Hamer (2), Gabriel N. Perdue (3), Steven R. Young
(4), Jonathan Miller (5), Anushree Ghosh (5) ((1) Rhodes College, (2) The
University of Iowa, (3) Fermi National Accelerator Laboratory, (4) Oak Ridge
National Laboratory, (5) Universidad Técnica Federico Santa María)
- Abstract summary: We study the relationships between a CNN's architecture and its performance.
We show that the attributes can be predictive of the networks' performance in two specific computer vision-based physics problems.
We use machine learning models to predict whether a network can perform better than a certain threshold accuracy before training.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Convolutional Neural Networks (CNNs) have shown strong promise for analyzing
scientific data from many domains including particle imaging detectors.
However, the challenge of choosing the appropriate network architecture (depth,
kernel shapes, activation functions, etc.) for specific applications and
different data sets is still poorly understood. In this paper, we study the
relationships between a CNN's architecture and its performance by proposing a
systematic language that is useful for comparing different CNN architectures
before training time. We characterize a CNN's architecture by different
attributes and demonstrate that these attributes can be predictive of
the networks' performance in two specific computer vision-based physics
problems -- event vertex finding and hadron multiplicity classification in the
MINERvA experiment at Fermi National Accelerator Laboratory. In doing so, we
extract several architectural attributes from networks optimized for these
physics problems, which are outputs of a model selection algorithm
called Multi-node Evolutionary Neural Networks for Deep Learning (MENNDL). We
use machine learning models to predict whether a network can perform better
than a certain threshold accuracy before training. The models perform 16-20%
better than random guessing. Additionally, we found a coefficient of
determination of 0.966 for an Ordinary Least Squares model in a regression on
accuracy over a large population of networks.
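The two prediction tasks in the abstract can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the attribute names and the synthetic data are placeholders standing in for the MENNDL-derived architectural attributes and measured post-training accuracies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical architectural attributes per network, e.g.
# [depth, number of conv layers, average kernel size, log10(parameter count)].
n_networks = 200
X = rng.uniform([4, 2, 1, 4], [40, 20, 7, 8], size=(n_networks, 4))

# Synthetic "measured" accuracies: a linear function of the attributes
# plus noise, standing in for real post-training accuracies.
true_w = np.array([0.004, 0.01, -0.005, 0.02])
y = 0.3 + X @ true_w + rng.normal(0.0, 0.01, n_networks)

# Task 1: Ordinary Least Squares regression of accuracy on attributes.
A = np.column_stack([np.ones(n_networks), X])   # add intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Coefficient of determination (R^2) of the fit.
y_hat = A @ coef
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
print(f"R^2 = {r2:.3f}")

# Task 2: classify, before training, whether a network is expected
# to exceed a threshold accuracy.
threshold = 0.6
predicted_above = y_hat > threshold
```

Because the synthetic accuracies are nearly linear in the attributes, the fit here recovers a high R^2 by construction; on real populations of trained networks, the paper reports 0.966 for the OLS model.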
Related papers
- Simultaneous Weight and Architecture Optimization for Neural Networks [6.2241272327831485]
We introduce a novel neural network training framework that transforms the process by learning architecture and parameters simultaneously with gradient descent.
Central to our approach is a multi-scale encoder-decoder, in which the encoder embeds pairs of neural networks with similar functionalities close to each other.
Experiments demonstrate that our framework can discover sparse and compact neural networks maintaining a high performance.
arXiv Detail & Related papers (2024-10-10T19:57:36Z) - Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - Principled Architecture-aware Scaling of Hyperparameters [69.98414153320894]
Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process.
In this work, we precisely characterize the dependence of initializations and maximal learning rates on the network architecture.
We demonstrate that network rankings can be easily changed by better training networks in benchmarks.
arXiv Detail & Related papers (2024-02-27T11:52:49Z) - Set-based Neural Network Encoding Without Weight Tying [91.37161634310819]
We propose a neural network weight encoding method for network property prediction.
Our approach is capable of encoding neural networks in a model zoo of mixed architecture.
We introduce two new tasks for neural network property prediction: cross-dataset and cross-architecture.
arXiv Detail & Related papers (2023-05-26T04:34:28Z) - NAR-Former: Neural Architecture Representation Learning towards Holistic
Attributes Prediction [37.357949900603295]
We propose a neural architecture representation model that can be used to estimate attributes holistically.
Experiment results show that our proposed framework can be used to predict the latency and accuracy attributes of both cell architectures and whole deep neural networks.
arXiv Detail & Related papers (2022-11-15T10:15:21Z) - Convolution Neural Network Hyperparameter Optimization Using Simplified
Swarm Optimization [2.322689362836168]
Convolutional Neural Network (CNN) is widely used in computer vision.
It is not easy to find a network architecture with better performance.
arXiv Detail & Related papers (2021-03-06T00:23:27Z) - Differentiable Neural Architecture Learning for Efficient Neural Network
Design [31.23038136038325]
We introduce a novel architecture parameterisation based on a scaled sigmoid function.
We then propose a general Differentiable Neural Architecture Learning (DNAL) method to optimize the neural architecture without the need to evaluate candidate neural networks.
arXiv Detail & Related papers (2021-03-03T02:03:08Z) - Firefly Neural Architecture Descent: a General Approach for Growing
Neural Networks [50.684661759340145]
Firefly neural architecture descent is a general framework for progressively and dynamically growing neural networks.
We show that firefly descent can flexibly grow networks both wider and deeper, and can be applied to learn accurate but resource-efficient neural architectures.
In particular, it learns networks that are smaller in size but have higher average accuracy than those learned by the state-of-the-art methods.
arXiv Detail & Related papers (2021-02-17T04:47:18Z) - Genetic U-Net: Automatically Designed Deep Networks for Retinal Vessel
Segmentation Using a Genetic Algorithm [2.6629444004809826]
Genetic U-Net is proposed to generate a U-shaped convolutional neural network (CNN) that can achieve better retinal vessel segmentation but with fewer architecture-based parameters.
The experimental results show that the architecture obtained using the proposed method offered a superior performance with less than 1% of the number of the original U-Net parameters in particular.
arXiv Detail & Related papers (2020-10-29T13:31:36Z) - A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z) - Analyzing Neural Networks Based on Random Graphs [77.34726150561087]
We perform a massive evaluation of neural networks with architectures corresponding to random graphs of various types.
We find that no classical numerical graph invariant by itself allows us to single out the best networks.
We also find that networks with primarily short-range connections perform better than networks which allow for many long-range connections.
arXiv Detail & Related papers (2020-02-19T11:04:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.