Neural Networks with A La Carte Selection of Activation Functions
- URL: http://arxiv.org/abs/2206.12166v1
- Date: Fri, 24 Jun 2022 09:09:39 GMT
- Title: Neural Networks with A La Carte Selection of Activation Functions
- Authors: Moshe Sipper
- Abstract summary: Activation functions (AFs) are pivotal to the success (or failure) of a neural network.
We combine a slew of known AFs into successful architectures, proposing three methods to do so beneficially.
We show that all methods often produce significantly better results for 25 classification problems when compared with a standard network composed of ReLU hidden units and a softmax output unit.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Activation functions (AFs), which are pivotal to the success (or failure) of
a neural network, have received increased attention in recent years, with
researchers seeking to design novel AFs that improve some aspect of network
performance. In this paper we take another direction, wherein we combine a slew
of known AFs into successful architectures, proposing three methods to do so
beneficially: 1) generate AF architectures at random, 2) use Optuna, an
automatic hyper-parameter optimization software framework, with a
Tree-structured Parzen Estimator (TPE) sampler, and 3) use Optuna with a
Covariance Matrix Adaptation Evolution Strategy (CMA-ES) sampler. We show that
all methods often produce significantly better results for 25 classification
problems when compared with a standard network composed of ReLU hidden units
and a softmax output unit. Optuna with the TPE sampler emerged as the best AF
architecture-producing method.
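Concretely, the search space in all three methods is the choice of AF at each hidden layer. Below is a minimal sketch of method 2 using Optuna's TPE sampler; the candidate AF pool, network shape, and synthetic data are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch of method 2 (Optuna + TPE) choosing one activation function
# per hidden layer. The AF pool, network shape, and synthetic data below are
# illustrative assumptions, not the paper's exact setup.
import optuna
import torch
import torch.nn as nn

AF_CANDIDATES = {
    "relu": nn.ReLU, "tanh": nn.Tanh, "elu": nn.ELU,
    "leaky_relu": nn.LeakyReLU, "sigmoid": nn.Sigmoid, "gelu": nn.GELU,
}

# Stand-in binary classification data; a real run would use one of the
# paper's 25 classification datasets with a train/validation split.
X = torch.randn(512, 20)
y = (X.sum(dim=1) > 0).long()

def build_network(trial, in_dim=20, hidden=32, n_classes=2, n_layers=3):
    layers, d = [], in_dim
    for i in range(n_layers):
        layers.append(nn.Linear(d, hidden))
        # Each hidden layer gets its own independently sampled AF.
        name = trial.suggest_categorical(f"af_{i}", sorted(AF_CANDIDATES))
        layers.append(AF_CANDIDATES[name]())
        d = hidden
    layers.append(nn.Linear(d, n_classes))  # softmax is folded into the loss
    return nn.Sequential(*layers)

def objective(trial):
    model = build_network(trial)
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(50):  # short toy training loop
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()
    # Toy fitness: accuracy on the training data (use held-out data in practice).
    return (model(X).argmax(dim=1) == y).float().mean().item()

study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=20)
print(study.best_params)  # e.g. {'af_0': 'gelu', 'af_1': 'relu', 'af_2': 'tanh'}
```

Method 1 amounts to swapping in optuna.samplers.RandomSampler() (each af_i drawn uniformly at random), and method 3 to optuna.samplers.CmaEsSampler(); note that Optuna's CmaEsSampler handles purely categorical parameters via its fallback independent sampler, so a faithful reproduction of method 3 may require a numeric encoding of the AF choice. Per the abstract, TPE produced the best AF architectures across the 25 problems.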
Related papers
- POMONAG: Pareto-Optimal Many-Objective Neural Architecture Generator [4.09225917049674]
Transferable NAS has emerged, generalizing the search process from dataset-dependent to task-dependent.
This paper introduces POMONAG, extending DiffusionNAG via a many-objective diffusion process.
Results were validated on two search spaces -- NAS201 and MobileNetV3 -- and evaluated across 15 image classification datasets.
arXiv Detail & Related papers (2024-09-30T16:05:29Z)
- HKNAS: Classification of Hyperspectral Imagery Based on Hyper Kernel Neural Architecture Search [104.45426861115972]
We propose to directly generate structural parameters by utilizing the specifically designed hyper kernels.
We obtain three kinds of networks to separately conduct pixel-level or image-level classifications with 1-D or 3-D convolutions.
A series of experiments on six public datasets demonstrate that the proposed methods achieve state-of-the-art results.
arXiv Detail & Related papers (2023-04-23T17:27:40Z)
- Enhancing Once-For-All: A Study on Parallel Blocks, Skip Connections and Early Exits [7.0895962209555465]
Once-For-All (OFA) is an eco-friendly algorithm characterised by the ability to generate easily adaptable models.
OFA is improved from an architectural point of view by including early exits, parallel blocks and dense skip connections.
OFAv2 improves accuracy on the Tiny ImageNet dataset by up to 12.07% compared to the original version of OFA.
arXiv Detail & Related papers (2023-02-03T17:53:40Z) - MRF-UNets: Searching UNet with Markov Random Fields [25.607512500358723]
We propose MRF-NAS that extends and improves the recent Adaptive and Optimal Network Width Search (AOWS) method.
We find an architecture, MRF-UNet, that shows several interesting characteristics.
Experiments show that our MRF-UNets significantly outperform several benchmarks on three aerial image datasets and two medical image datasets.
arXiv Detail & Related papers (2022-07-13T13:04:18Z) - Evolution of Activation Functions for Deep Learning-Based Image
Classification [0.0]
Activation functions (AFs) play a pivotal role in the performance of neural networks.
We propose a novel, three-population, coevolutionary algorithm to evolve AFs.
Tested on four datasets -- MNIST, FashionMNIST, KMNIST, and USPS -- coevolution proves to be a performant algorithm for finding good AFs and AF architectures.
arXiv Detail & Related papers (2022-06-24T05:58:23Z)
- FBNetV3: Joint Architecture-Recipe Search using Predictor Pretraining [65.39532971991778]
We present an accuracy predictor that scores architecture and training recipes jointly, guiding both sample selection and ranking.
We run fast evolutionary searches in just CPU minutes to generate architecture-recipe pairs for a variety of resource constraints.
FBNetV3 is a family of state-of-the-art compact neural networks that outperform both automatically and manually designed competitors.
arXiv Detail & Related papers (2020-06-03T05:20:21Z)
- Binarizing MobileNet via Evolution-based Searching [66.94247681870125]
We propose the use of evolutionary search to facilitate the construction and training scheme when binarizing MobileNet.
Inspired by one-shot architecture search frameworks, we apply the idea of group convolution to design efficient 1-Bit Convolutional Neural Networks (CNNs).
Our objective is to come up with a tiny yet efficient binary neural architecture by exploring the best candidates of the group convolution.
arXiv Detail & Related papers (2020-05-13T13:25:51Z)
- Recent Developments Combining Ensemble Smoother and Deep Generative Networks for Facies History Matching [58.720142291102135]
This research project focuses on the use of autoencoder networks to construct a continuous parameterization for facies models.
We benchmark seven different formulations, including VAE, generative adversarial network (GAN), Wasserstein GAN, variational auto-encoding GAN, principal component analysis (PCA) with cycle GAN, PCA with transfer style network and VAE with style loss.
arXiv Detail & Related papers (2020-05-08T21:32:42Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
- ASFD: Automatic and Scalable Face Detector [129.82350993748258]
We propose a novel Automatic and Scalable Face Detector (ASFD).
ASFD is based on a combination of neural architecture search techniques as well as a new loss design.
Our ASFD-D6 outperforms prior strong competitors, and our lightweight ASFD-D0 runs at more than 120 FPS with MobileNet for VGA-resolution images.
arXiv Detail & Related papers (2020-03-25T06:00:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.