Neural Networks with A La Carte Selection of Activation Functions
- URL: http://arxiv.org/abs/2206.12166v1
- Date: Fri, 24 Jun 2022 09:09:39 GMT
- Title: Neural Networks with A La Carte Selection of Activation Functions
- Authors: Moshe Sipper
- Abstract summary: Activation functions (AFs) are pivotal to the success (or failure) of a neural network.
We combine a slew of known AFs into successful architectures, proposing three methods to do so beneficially.
We show that all methods often produce significantly better results for 25 classification problems when compared with a standard network composed of ReLU hidden units and a softmax output unit.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Activation functions (AFs), which are pivotal to the success (or failure) of
a neural network, have received increased attention in recent years, with
researchers seeking to design novel AFs that improve some aspect of network
performance. In this paper we take another direction, wherein we combine a slew
of known AFs into successful architectures, proposing three methods to do so
beneficially: 1) generate AF architectures at random, 2) use Optuna, an
automatic hyper-parameter optimization software framework, with a
Tree-structured Parzen Estimator (TPE) sampler, and 3) use Optuna with a
Covariance Matrix Adaptation Evolution Strategy (CMA-ES) sampler. We show that
all methods often produce significantly better results for 25 classification
problems when compared with a standard network composed of ReLU hidden units
and a softmax output unit. Optuna with the TPE sampler emerged as the best AF
architecture-producing method.
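Concretely, the search space in all three methods is the choice of AF at each hidden layer. Below is a minimal sketch of method 2 using Optuna's TPE sampler; the candidate AF pool, network shape, and synthetic data are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch of method 2 (Optuna + TPE) choosing one activation function
# per hidden layer. The AF pool, network shape, and synthetic data below are
# illustrative assumptions, not the paper's exact setup.
import optuna
import torch
import torch.nn as nn

AF_CANDIDATES = {
    "relu": nn.ReLU, "tanh": nn.Tanh, "elu": nn.ELU,
    "leaky_relu": nn.LeakyReLU, "sigmoid": nn.Sigmoid, "gelu": nn.GELU,
}

# Stand-in binary classification data; a real run would use one of the
# paper's 25 classification datasets with a train/validation split.
X = torch.randn(512, 20)
y = (X.sum(dim=1) > 0).long()

def build_network(trial, in_dim=20, hidden=32, n_classes=2, n_layers=3):
    layers, d = [], in_dim
    for i in range(n_layers):
        layers.append(nn.Linear(d, hidden))
        # Each hidden layer gets its own independently sampled AF.
        name = trial.suggest_categorical(f"af_{i}", sorted(AF_CANDIDATES))
        layers.append(AF_CANDIDATES[name]())
        d = hidden
    layers.append(nn.Linear(d, n_classes))  # softmax is folded into the loss
    return nn.Sequential(*layers)

def objective(trial):
    model = build_network(trial)
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(50):  # short toy training loop
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()
    # Toy fitness: accuracy on the training data (use held-out data in practice).
    return (model(X).argmax(dim=1) == y).float().mean().item()

study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=20)
print(study.best_params)  # e.g. {'af_0': 'gelu', 'af_1': 'relu', 'af_2': 'tanh'}
```

Method 1 amounts to swapping in optuna.samplers.RandomSampler() (each af_i drawn uniformly at random), and method 3 to optuna.samplers.CmaEsSampler(); note that Optuna's CmaEsSampler handles purely categorical parameters via its fallback independent sampler, so a faithful reproduction of method 3 may require a numeric encoding of the AF choice. Per the abstract, TPE produced the best AF architectures across the 25 problems.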
Related papers
- POMONAG: Pareto-Optimal Many-Objective Neural Architecture Generator [4.09225917049674]
Transferable NAS has emerged, generalizing the search process from dataset-dependent to task-dependent.
This paper introduces POMONAG, extending DiffusionNAG via a many-objective diffusion process.
Results were validated on two search spaces -- NAS201 and MobileNetV3 -- and evaluated across 15 image classification datasets.
arXiv Detail & Related papers (2024-09-30T16:05:29Z)
- HKNAS: Classification of Hyperspectral Imagery Based on Hyper Kernel Neural Architecture Search [104.45426861115972]
We propose to directly generate structural parameters by utilizing the specifically designed hyper kernels.
We obtain three kinds of networks to separately conduct pixel-level or image-level classifications with 1-D or 3-D convolutions.
A series of experiments on six public datasets demonstrate that the proposed methods achieve state-of-the-art results.
arXiv Detail & Related papers (2023-04-23T17:27:40Z)
- Enhancing Once-For-All: A Study on Parallel Blocks, Skip Connections and Early Exits [7.0895962209555465]
Once-For-All (OFA) is an eco-friendly algorithm characterised by the ability to generate easily adaptable models.
OFA is improved from an architectural point of view by including early exits, parallel blocks and dense skip connections.
OFAv2 improves accuracy on the Tiny ImageNet dataset by up to 12.07% compared to the original version of OFA.
arXiv Detail & Related papers (2023-02-03T17:53:40Z) - MRF-UNets: Searching UNet with Markov Random Fields [25.607512500358723]
We propose MRF-NAS that extends and improves the recent Adaptive and Optimal Network Width Search (AOWS) method.
We find an architecture, MRF-UNet, that shows several interesting characteristics.
Experiments show that our MRF-UNets significantly outperform several benchmarks on three aerial image datasets and two medical image datasets.
arXiv Detail & Related papers (2022-07-13T13:04:18Z) - Evolution of Activation Functions for Deep Learning-Based Image
Classification [0.0]
Activation functions (AFs) play a pivotal role in the performance of neural networks.
We propose a novel, three-population, coevolutionary algorithm to evolve AFs.
Tested on four datasets -- MNIST, FashionMNIST, KMNIST, and USPS -- coevolution proves to be a performant algorithm for finding good AFs and AF architectures.
arXiv Detail & Related papers (2022-06-24T05:58:23Z)
- FBNetV3: Joint Architecture-Recipe Search using Predictor Pretraining [65.39532971991778]
We present an accuracy predictor that scores architecture and training recipes jointly, guiding both sample selection and ranking.
We run fast evolutionary searches in just CPU minutes to generate architecture-recipe pairs for a variety of resource constraints.
FBNetV3 is a family of state-of-the-art compact neural networks that outperform both automatically and manually designed competitors.
arXiv Detail & Related papers (2020-06-03T05:20:21Z)
- Binarizing MobileNet via Evolution-based Searching [66.94247681870125]
We propose the use of evolutionary search to facilitate the construction and training scheme when binarizing MobileNet.
Inspired by one-shot architecture search frameworks, we apply the idea of group convolution to design efficient 1-Bit Convolutional Neural Networks (CNNs).
Our objective is to come up with a tiny yet efficient binary neural architecture by exploring the best candidates of the group convolution.
arXiv Detail & Related papers (2020-05-13T13:25:51Z)
- Recent Developments Combining Ensemble Smoother and Deep Generative Networks for Facies History Matching [58.720142291102135]
This research project focuses on the use of autoencoder networks to construct a continuous parameterization for facies models.
We benchmark seven different formulations, including VAE, generative adversarial network (GAN), Wasserstein GAN, variational auto-encoding GAN, principal component analysis (PCA) with cycle GAN, PCA with transfer style network and VAE with style loss.
arXiv Detail & Related papers (2020-05-08T21:32:42Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
- ASFD: Automatic and Scalable Face Detector [129.82350993748258]
We propose a novel Automatic and Scalable Face Detector (ASFD).
ASFD is based on a combination of neural architecture search techniques as well as a new loss design.
Our ASFD-D6 outperforms prior strong competitors, and our lightweight ASFD-D0 runs at more than 120 FPS with MobileNet for VGA-resolution images.
arXiv Detail & Related papers (2020-03-25T06:00:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.