Noisy Heuristics NAS: A Network Morphism based Neural Architecture
Search using Heuristics
- URL: http://arxiv.org/abs/2207.04467v1
- Date: Sun, 10 Jul 2022 13:58:21 GMT
- Title: Noisy Heuristics NAS: A Network Morphism based Neural Architecture
Search using Heuristics
- Authors: Suman Sapkota and Binod Bhattarai
- Abstract summary: We present a new Network Morphism based NAS called Noisy Heuristics NAS.
We add new neurons randomly and prune away some to select only the best-fitting neurons.
Our method generalizes to both toy datasets and real-world datasets such as MNIST, CIFAR-10, and CIFAR-100.
- Score: 11.726528038065764
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Network Morphism based Neural Architecture Search (NAS) is one of the most
efficient methods; however, knowing where and when to add new neurons or remove
dysfunctional ones is generally left to black-box Reinforcement Learning models. In
this paper, we present a new Network Morphism based NAS called Noisy Heuristics NAS,
which uses heuristics learned from manually developing neural network models and
inspired by biological neuronal dynamics. Firstly, we add new neurons randomly and
prune away some to select only the best-fitting neurons. Secondly, we control the
number of layers in the network using the relationship between the number of hidden
units and the number of input-output connections. Our method can increase or decrease
the capacity or non-linearity of models online, controlled by a few user-specified
meta-parameters. Our method generalizes to both toy datasets and real-world datasets
such as MNIST, CIFAR-10, and CIFAR-100, with performance comparable to the
hand-engineered ResNet-18 at a similar parameter count.
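The sketch below illustrates the first heuristic described in the abstract (grow
neurons with random weights, then prune away all but the best-fitting ones) on a
one-hidden-layer ReLU network. It is a minimal illustration only: the pruning
criterion (outgoing-weight norm), the growth schedule, and all hyper-parameters are
assumptions rather than the authors' exact rules, and the layer-count heuristic is
omitted.

```python
# Minimal sketch of the grow-and-prune heuristic from the abstract: neurons are
# added with random weights, trained for a while, and the least useful ones are
# pruned. The fitness measure (outgoing-weight norm) and the schedule below are
# illustrative assumptions, not the paper's exact rules.
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class GrowPruneMLP:
    """One-hidden-layer ReLU network whose width changes online."""

    def __init__(self, n_in, n_out, n_hidden=4, lr=1e-2):
        self.lr = lr
        self.W1 = rng.normal(0, 0.5, (n_in, n_hidden))
        self.W2 = rng.normal(0, 0.5, (n_hidden, n_out))

    def forward(self, X):
        self.H = relu(X @ self.W1)
        return self.H @ self.W2

    def train_step(self, X, y):
        err = self.forward(X) - y                  # gradient of 0.5 * MSE
        gW2 = self.H.T @ err / len(X)
        gH = err @ self.W2.T * (self.H > 0)        # backprop through ReLU
        gW1 = X.T @ gH / len(X)
        self.W1 -= self.lr * gW1
        self.W2 -= self.lr * gW2
        return float(np.mean(err ** 2))

    def grow(self, n_new):
        """Add n_new randomly initialized neurons (noisy growth)."""
        n_in, n_out = self.W1.shape[0], self.W2.shape[1]
        self.W1 = np.hstack([self.W1, rng.normal(0, 0.5, (n_in, n_new))])
        # Near-zero outgoing weights keep the current function almost
        # unchanged at the moment of growth (network-morphism style).
        self.W2 = np.vstack([self.W2, rng.normal(0, 1e-3, (n_new, n_out))])

    def prune(self, keep):
        """Keep only the `keep` neurons with the largest outgoing norm."""
        idx = np.argsort(np.linalg.norm(self.W2, axis=1))[-keep:]
        self.W1, self.W2 = self.W1[:, idx], self.W2[idx, :]

# Toy regression: y = sin(3x); alternate growth and pruning during training.
X = rng.uniform(-1, 1, (256, 1))
y = np.sin(3 * X)
net = GrowPruneMLP(1, 1)
for step in range(4000):
    loss = net.train_step(X, y)
    if step % 1000 == 499:
        net.grow(8)                                # inject noisy candidates
    if step % 1000 == 999:
        net.prune(net.W1.shape[1] - 4)             # keep the best-fitting ones
print(f"final width = {net.W1.shape[1]}, loss = {loss:.4f}")
```

Newly grown neurons start with near-zero outgoing weights, so they leave the
network's function almost unchanged at the moment of growth (the function-preserving
idea behind network morphism) and must earn outgoing weight during training before
the next pruning step.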
Related papers
- Simultaneous Weight and Architecture Optimization for Neural Networks [6.2241272327831485]
We introduce a novel neural network training framework that transforms the training process by learning the architecture and the parameters simultaneously with gradient descent.
Central to our approach is a multi-scale encoder-decoder, in which the encoder embeds pairs of neural networks with similar functionalities close to each other.
Experiments demonstrate that our framework can discover sparse and compact neural networks while maintaining high performance.
arXiv Detail & Related papers (2024-10-10T19:57:36Z) - NEAR: A Training-Free Pre-Estimator of Machine Learning Model Performance [0.0]
We propose a zero-cost proxy Network Expressivity by Activation Rank (NEAR) to identify the optimal neural network without training.
We demonstrate a state-of-the-art correlation between this network score and model accuracy on NAS-Bench-101 and NATS-Bench-SSS/TSS.
arXiv Detail & Related papers (2024-08-16T14:38:14Z) - DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions [121.05720140641189]
We develop a family of models with the distilling neural architecture (DNA) techniques.
Our proposed DNA models can rate all architecture candidates, as opposed to previous works that can only access a sub-search space using heuristic algorithms.
Our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively.
arXiv Detail & Related papers (2024-03-02T22:16:47Z) - NASiam: Efficient Representation Learning using Neural Architecture
Search for Siamese Networks [76.8112416450677]
Siamese networks are one of the most trending methods for self-supervised visual representation learning (SSL).
NASiam is a novel approach that, for the first time, uses differentiable NAS to improve the multilayer perceptron projector and predictor (encoder/predictor pair).
NASiam reaches competitive performance on both small-scale (i.e., CIFAR-10/CIFAR-100) and large-scale (i.e., ImageNet) image classification datasets while costing only a few GPU hours.
arXiv Detail & Related papers (2023-01-31T19:48:37Z) - Evolutionary Neural Cascade Search across Supernetworks [68.8204255655161]
We introduce ENCAS - Evolutionary Neural Cascade Search.
ENCAS can be used to search over multiple pretrained supernetworks.
We test ENCAS on common computer vision benchmarks.
arXiv Detail & Related papers (2022-03-08T11:06:01Z) - Neural Architecture Search on ImageNet in Four GPU Hours: A
Theoretically Inspired Perspective [88.39981851247727]
We propose a novel framework called training-free neural architecture search (TE-NAS).
TE-NAS ranks architectures by analyzing the spectrum of the neural tangent kernel (NTK) and the number of linear regions in the input space.
We show that: (1) these two measurements imply the trainability and expressivity of a neural network; (2) they strongly correlate with the network's test accuracy (a toy sketch of the linear-region measurement is given after this list).
arXiv Detail & Related papers (2021-02-23T07:50:44Z) - Evolutionary Neural Architecture Search Supporting Approximate
Multipliers [0.5414308305392761]
We propose a multi-objective NAS method based on Cartesian genetic programming for evolving convolutional neural networks (CNNs).
The most suitable approximate multipliers are automatically selected from a library of approximate multipliers.
Evolved CNNs are compared with common human-created CNNs of a similar complexity on the CIFAR-10 benchmark problem.
arXiv Detail & Related papers (2021-01-28T09:26:03Z) - Incremental Training of a Recurrent Neural Network Exploiting a
Multi-Scale Dynamic Memory [79.42778415729475]
We propose a novel incrementally trained recurrent architecture targeting explicitly multi-scale learning.
We show how to extend the architecture of a simple RNN by separating its hidden state into different modules.
We discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies.
arXiv Detail & Related papers (2020-06-29T08:35:49Z) - Deep Learning in Memristive Nanowire Networks [0.0]
A new hardware architecture, dubbed the MN3 (Memristive Nanowire Neural Network), was recently described as an efficient architecture for simulating very wide, sparse neural network layers.
We show that the MN3 is capable of performing composition, gradient propagation, and weight updates, which together allow it to function as a deep neural network.
arXiv Detail & Related papers (2020-03-03T20:11:33Z) - Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
We show that a standard neuron followed by our novel apical dendrite activation (ADA) can learn the XOR logical function with 100% accuracy.
We conduct experiments on six benchmark data sets from computer vision, signal processing and natural language processing.
arXiv Detail & Related papers (2020-02-02T21:09:39Z)
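As referenced in the TE-NAS entry above, the sketch below illustrates one of its two
training-free measurements: estimating the number of linear regions of a ReLU network
by counting distinct activation patterns over random inputs. The architecture
(a bias-free MLP), initialization, and sample count are illustrative assumptions, and
the NTK-spectrum measurement is omitted.

```python
# Toy estimator of the "number of linear regions" proxy used by TE-NAS-style
# training-free NAS: each distinct ReLU on/off pattern over a batch of random
# inputs corresponds to a different linear region of the network at
# initialization. Network shapes and sample counts are illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def count_linear_regions(widths, n_samples=2000, n_in=2):
    """Count unique ReLU activation patterns of a randomly initialized MLP."""
    X = rng.normal(size=(n_samples, n_in))
    patterns = []
    h = X
    for w in widths:
        W = rng.normal(0, np.sqrt(2.0 / h.shape[1]), (h.shape[1], w))
        pre = h @ W
        patterns.append(pre > 0)               # ReLU gate pattern per sample
        h = np.maximum(pre, 0.0)
    codes = np.hstack(patterns)                # (n_samples, total_units) booleans
    return len(np.unique(codes, axis=0))       # distinct patterns ~ linear regions

# Wider / deeper candidates tend to carve the input space into more regions,
# which is (together with the NTK spectrum) how TE-NAS ranks architectures.
for widths in [(4,), (16,), (16, 16)]:
    print(widths, count_linear_regions(widths))
```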