Adaptive Activation Functions for Predictive Modeling with Sparse
Experimental Data
- URL: http://arxiv.org/abs/2402.05401v1
- Date: Thu, 8 Feb 2024 04:35:09 GMT
- Title: Adaptive Activation Functions for Predictive Modeling with Sparse
Experimental Data
- Authors: Farhad Pourkamali-Anaraki, Tahamina Nasrin, Robert E. Jensen, Amy M.
Peterson, Christopher J. Hansen
- Abstract summary: This study investigates the influence of adaptive or trainable activation functions on classification accuracy and predictive uncertainty in settings characterized by limited data availability.
Our investigation reveals that adaptive activation functions, such as Exponential Linear Unit (ELU) and Softplus, with individual trainable parameters, result in accurate and confident prediction models.
- Score: 2.012425476229879
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A pivotal aspect in the design of neural networks lies in selecting
activation functions, crucial for introducing nonlinear structures that capture
intricate input-output patterns. While the effectiveness of adaptive or
trainable activation functions has been studied in domains with ample data,
like image classification problems, significant gaps persist in understanding
their influence on classification accuracy and predictive uncertainty in
settings characterized by limited data availability. This research aims to
address these gaps by investigating the use of two types of adaptive activation
functions. These functions incorporate shared and individual trainable
parameters per hidden layer and are examined in three testbeds derived from
additive manufacturing problems containing fewer than one hundred training
instances. Our investigation reveals that adaptive activation functions, such
as Exponential Linear Unit (ELU) and Softplus, with individual trainable
parameters, result in accurate and confident prediction models that outperform
fixed-shape activation functions and the less flexible method of using
identical trainable activation functions in a hidden layer. Therefore, this
work presents an elegant way of facilitating the design of adaptive neural
networks in scientific and engineering problems.
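To make the abstract's distinction concrete, the sketch below shows a hidden layer whose ELU activation carries either one trainable shape parameter per neuron (the individual variant) or a single parameter shared by the whole layer (the less flexible variant). This is a minimal PyTorch illustration under assumed names (AdaptiveELU, SmallClassifier); it is not the authors' implementation, and a trainable Softplus could be set up the same way.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveELU(nn.Module):
    """ELU with a trainable shape parameter alpha.

    shared=True  -> one alpha for the whole layer (identical activation per layer)
    shared=False -> one alpha per neuron (individual trainable parameters)
    """
    def __init__(self, num_units: int, shared: bool = False):
        super().__init__()
        size = 1 if shared else num_units
        # Initialize at alpha = 1, which recovers the standard fixed-shape ELU.
        self.alpha = nn.Parameter(torch.ones(size))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # elu(x) = x for x > 0, alpha * (exp(x) - 1) otherwise
        return torch.where(x > 0, x, self.alpha * (torch.exp(x) - 1.0))

class SmallClassifier(nn.Module):
    """Tiny MLP of the kind used for sparse-data classification testbeds."""
    def __init__(self, in_features: int, hidden: int, num_classes: int,
                 shared_activation: bool = False):
        super().__init__()
        self.fc1 = nn.Linear(in_features, hidden)
        self.act = AdaptiveELU(hidden, shared=shared_activation)
        self.fc2 = nn.Linear(hidden, num_classes)

    def forward(self, x):
        return self.fc2(self.act(self.fc1(x)))

if __name__ == "__main__":
    # Synthetic example with fewer than one hundred training instances.
    X = torch.randn(80, 4)
    y = torch.randint(0, 2, (80,))
    model = SmallClassifier(in_features=4, hidden=16, num_classes=2)
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(200):
        opt.zero_grad()
        loss = F.cross_entropy(model(X), y)
        loss.backward()
        opt.step()
    print("trained alphas:", model.act.alpha.detach())
```

Setting shared_activation=True recovers the baseline in which every unit in a hidden layer uses an identical trainable activation, which the abstract reports as less flexible than per-neuron parameters.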
Related papers
- Parameter-Efficient and Memory-Efficient Tuning for Vision Transformer: A Disentangled Approach [87.8330887605381]
We show how to adapt a pre-trained Vision Transformer to downstream recognition tasks with only a few learnable parameters.
We synthesize a task-specific query with a learnable and lightweight module, which is independent of the pre-trained model.
Our method achieves state-of-the-art performance under memory constraints, showcasing its applicability in real-world situations.
arXiv Detail & Related papers (2024-07-09T15:45:04Z) - ENN: A Neural Network with DCT Adaptive Activation Functions [2.2713084727838115]
We present Expressive Neural Network (ENN), a novel model in which the non-linear activation functions are modeled using the Discrete Cosine Transform (DCT)
This parametrization keeps the number of trainable parameters low, is appropriate for gradient-based schemes, and adapts to different learning tasks.
ENN outperforms state-of-the-art benchmarks, with an accuracy gap of over 40% in some scenarios; a simplified cosine-series sketch of this kind of parametrization appears after the related-papers list.
arXiv Detail & Related papers (2023-07-02T21:46:30Z) - Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z) - Bayesian optimization for sparse neural networks with trainable
activation functions [0.0]
We propose a trainable activation function whose parameters need to be estimated.
A fully Bayesian model is developed to automatically estimate from the learning data both the model weights and activation function parameters.
arXiv Detail & Related papers (2023-04-10T08:44:44Z) - Efficient Activation Function Optimization through Surrogate Modeling [15.219959721479835]
This paper aims to improve the state of the art through three steps.
First, the benchmarks Act-Bench-CNN, Act-Bench-ResNet, and Act-Bench-ViT were created by training convolutional, residual, and vision transformer architectures.
Second, a characterization of the benchmark space was developed, leading to a new surrogate-based method for optimization.
arXiv Detail & Related papers (2023-01-13T23:11:14Z) - Stochastic Adaptive Activation Function [1.9199289015460212]
This study proposes a simple yet effective activation function that facilitates different thresholds and adaptive activations according to the positions of units and the contexts of inputs.
Experimental analysis demonstrates that our activation function can provide the benefits of more accurate prediction and earlier convergence in many deep learning applications.
arXiv Detail & Related papers (2022-10-21T01:57:25Z) - Transformers with Learnable Activation Functions [63.98696070245065]
We use Rational Activation Function (RAF) to learn optimal activation functions during training according to input data.
RAF opens a new research direction for analyzing and interpreting pre-trained models according to the learned activation functions.
arXiv Detail & Related papers (2022-08-30T09:47:31Z) - Squashing activation functions in benchmark tests: towards eXplainable
Artificial Intelligence using continuous-valued logic [0.0]
This work demonstrates the first benchmark tests that measure the performance of Squashing functions in neural networks.
Three experiments were carried out to examine their usability and a comparison with the most popular activation functions was made for five different network types.
Results indicate that due to the embedded nilpotent logical operators and the differentiability of the Squashing function, it is possible to solve classification problems.
arXiv Detail & Related papers (2020-10-17T10:42:40Z) - Estimating Structural Target Functions using Machine Learning and
Influence Functions [103.47897241856603]
We propose a new framework for statistical machine learning of target functions arising as identifiable functionals from statistical models.
This framework is problem- and model-agnostic and can be used to estimate a broad variety of target parameters of interest in applied statistics.
We put particular focus on so-called coarsening at random/doubly robust problems with partially unobserved information.
arXiv Detail & Related papers (2020-08-14T16:48:29Z) - Influence Functions in Deep Learning Are Fragile [52.31375893260445]
Influence functions approximate the effect of training samples on test-time predictions.
Influence estimates are fairly accurate for shallow networks.
Hessian regularization is important for obtaining high-quality influence estimates.
arXiv Detail & Related papers (2020-06-25T18:25:59Z) - Towards Efficient Processing and Learning with Spikes: New Approaches
for Multi-Spike Learning [59.249322621035056]
We propose two new multi-spike learning rules which demonstrate better performance over other baselines on various tasks.
In the feature detection task, we re-examine the ability of unsupervised STDP and present its limitations.
Our proposed learning rules can reliably solve the task over a wide range of conditions without specific constraints being applied.
arXiv Detail & Related papers (2020-05-02T06:41:20Z)
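The ENN entry above models activation functions with the Discrete Cosine Transform. The sketch below illustrates the general idea with a truncated cosine series whose coefficients are trained by gradient descent; the clipping range, the number of coefficients, and the initialization are illustrative assumptions, not the ENN formulation.

```python
import math
import torch
import torch.nn as nn

class CosineSeriesActivation(nn.Module):
    """Activation expressed as a truncated cosine (DCT-like) series.

    f(x) = sum_k c_k * cos(pi * k * (x - lo) / (hi - lo)), with trainable c_k.
    Inputs are clipped to [lo, hi] so the basis stays well defined.
    """
    def __init__(self, num_coeffs: int = 8, lo: float = -3.0, hi: float = 3.0):
        super().__init__()
        self.lo, self.hi = lo, hi
        # Small random coefficients; training shapes the nonlinearity from data.
        self.coeffs = nn.Parameter(0.1 * torch.randn(num_coeffs))
        self.register_buffer("k", torch.arange(num_coeffs).float())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = torch.clamp(x, self.lo, self.hi)
        # Normalize x to [0, 1] and evaluate the cosine basis at each input.
        t = (x - self.lo) / (self.hi - self.lo)
        basis = torch.cos(math.pi * self.k * t.unsqueeze(-1))  # (..., num_coeffs)
        return basis @ self.coeffs                              # (...)

# Example: apply the learned activation element-wise to a batch of hidden features.
act = CosineSeriesActivation(num_coeffs=8)
h = act(torch.randn(32, 16))  # shape preserved: (32, 16)
```

Because the number of trainable coefficients is fixed and small, this parametrization keeps the parameter count low while remaining compatible with standard gradient-based training, which is the property the ENN summary highlights.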
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.