Trainable Compound Activation Functions for Machine Learning
- URL: http://arxiv.org/abs/2204.12920v1
- Date: Mon, 25 Apr 2022 19:53:04 GMT
- Title: Trainable Compound Activation Functions for Machine Learning
- Authors: Paul M. Baggenstoss
- Abstract summary: Activation functions (AF) are necessary components of neural networks that allow approximation of functions.
We propose trainable compound AF (TCA) composed of a sum of shifted and scaled simple AFs.
- Score: 13.554038901140949
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Activation functions (AF) are necessary components of neural networks that
allow approximation of functions, but AFs in current use are usually simple
monotonically increasing functions. In this paper, we propose trainable
compound AF (TCA) composed of a sum of shifted and scaled simple AFs. TCAs
increase the effectiveness of networks with fewer parameters compared to added
layers. TCAs have a special interpretation in generative networks because they
effectively estimate the marginal distributions of each dimension of the data
using a mixture distribution, reducing modality and making linear dimension
reduction more effective. When used in restricted Boltzmann machines (RBMs),
they result in a novel type of RBM with mixture-based stochastic units.
Improved performance is demonstrated in experiments using RBMs, deep belief
networks (DBN), projected belief networks (PBN), and variational auto-encoders
(VAE).
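As a rough illustration of the abstract's central idea, the sketch below implements a trainable compound activation as a per-feature sum of shifted and scaled tanh units. The choice of tanh as the simple AF, the number of components, and the initialization are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class TrainableCompoundActivation(nn.Module):
    """Sum of shifted and scaled simple activation functions (here: tanh).

    y(x) = sum_k a_k * tanh(s_k * (x - b_k)), learned per feature dimension.
    Component count, base function, and initialization are illustrative
    assumptions; the paper's exact parameterization may differ.
    """

    def __init__(self, num_features: int, num_components: int = 4):
        super().__init__()
        # One set of scales, shifts, and amplitudes per feature dimension.
        self.scale = nn.Parameter(torch.ones(num_features, num_components))
        self.shift = nn.Parameter(torch.linspace(-2.0, 2.0, num_components)
                                  .repeat(num_features, 1))
        self.amplitude = nn.Parameter(torch.full((num_features, num_components),
                                                 1.0 / num_components))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_features) -> broadcast against (num_features, num_components)
        z = self.scale * (x.unsqueeze(-1) - self.shift)
        return (self.amplitude * torch.tanh(z)).sum(dim=-1)


# Drop-in replacement for a fixed activation inside a small MLP:
layer = nn.Sequential(nn.Linear(8, 16),
                      TrainableCompoundActivation(16),
                      nn.Linear(16, 1))
out = layer(torch.randn(32, 8))  # -> shape (32, 1)
```

Keeping the compound parameters per feature adds only a handful of parameters per unit, which is the trade-off the abstract highlights relative to adding whole layers.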
Related papers
- JotlasNet: Joint Tensor Low-Rank and Attention-based Sparse Unrolling Network for Accelerating Dynamic MRI [6.081607038128913]
We propose a novel deep unrolling network, JotlasNet, for dynamic MRI reconstruction.
Joint low-rank and sparse unrolling networks have shown superior performance in dynamic MRI reconstruction.
arXiv Detail & Related papers (2025-02-17T12:43:04Z)
- Trainable Adaptive Activation Function Structure (TAAFS) Enhances Neural Network Force Field Performance with Only Dozens of Additional Parameters [0.0]
Trainable Adaptive Activation Function Structure (TAAFS)
We introduce a method that selects distinct mathematical formulations for non-linear activations.
In this study, we integrate TAAFS into a variety of neural network models, resulting in observed accuracy improvements.
arXiv Detail & Related papers (2024-12-19T09:06:39Z)
- Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware [78.17783007774295]
This paper explores the potential of conversion-based neuromorphic algorithms for highly accurate and energy-efficient single-snapshot multidimensional harmonic retrieval.
A novel method for converting the complex-valued convolutional layers and activations into spiking neural networks (SNNs) is developed.
The converted SNNs achieve almost five-fold power efficiency at moderate performance loss compared to the original CNNs.
arXiv Detail & Related papers (2024-12-05T09:41:33Z)
- Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node [49.08777822540483]
Fast feedforward networks (FFFs) exploit the observation that different regions of the input space activate distinct subsets of neurons in wide networks.
We propose the incorporation of load balancing and Master Leaf techniques into the FFF architecture to improve performance and simplify the training process.
arXiv Detail & Related papers (2024-05-27T05:06:24Z)
- Fractional Concepts in Neural Networks: Enhancing Activation Functions [0.6445605125467574]
This study integrates fractional calculus into neural networks by introducing fractional order derivatives (FDO) as tunable parameters in activation functions.
We evaluate these fractional activation functions on various datasets and network architectures, comparing their performance with traditional and new activation functions.
arXiv Detail & Related papers (2023-10-18T10:49:29Z)
- Leveraging Low-Rank and Sparse Recurrent Connectivity for Robust Closed-Loop Control [63.310780486820796]
We show how a parameterization of recurrent connectivity influences robustness in closed-loop settings.
We find that closed-form continuous-time neural networks (CfCs) with fewer parameters can outperform their full-rank, fully-connected counterparts.
arXiv Detail & Related papers (2023-10-05T21:44:18Z)
- Achieving Efficient Distributed Machine Learning Using a Novel Non-Linear Class of Aggregation Functions [9.689867512720083]
Distributed machine learning (DML) over time-varying networks can be an enabler for emerging decentralized ML applications.
We propose a novel non-linear class of model aggregation functions to achieve efficient DML over time-varying networks.
arXiv Detail & Related papers (2022-01-29T03:43:26Z)
- Federated Dynamic Sparse Training: Computing Less, Communicating Less, Yet Learning Better [88.28293442298015]
Federated learning (FL) enables distribution of machine learning workloads from the cloud to resource-limited edge devices.
We develop, implement, and experimentally validate a novel FL framework termed Federated Dynamic Sparse Training (FedDST)
FedDST is a dynamic process that extracts and trains sparse sub-networks from the target full network.
arXiv Detail & Related papers (2021-12-18T02:26:38Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computations.
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
- Activation functions are not needed: the ratio net [3.9636371287541086]
This paper focuses on designing a new function approximator.
Instead of designing new activation functions or kernel functions, the proposed network uses a ratio (fractional) form, sketched after this list.
It shows that, in most cases, the ratio net converges faster and outperforms both conventional classification networks and RBF networks.
arXiv Detail & Related papers (2020-05-14T01:07:56Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed stochastic algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our method requires far fewer communication rounds while retaining its theoretical guarantees.
Experiments on several benchmark datasets demonstrate the effectiveness of the method and corroborate the theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
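The "ratio net" entry above describes an approximator built from a fractional (ratio) form instead of a fixed activation function. The summary gives no parameterization, so the sketch below is only one plausible reading; the affine numerator and denominator and the stabilizing term are chosen here for illustration.

```python
import torch
import torch.nn as nn

class RatioLayer(nn.Module):
    """Approximator built from a ratio of two learned affine maps.

    y = (W_n x + b_n) / (1 + |W_d x + b_d|)

    An illustrative guess at a 'fractional form' layer; the offset and
    absolute value only keep the denominator away from zero.
    """

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.num = nn.Linear(in_features, out_features)
        self.den = nn.Linear(in_features, out_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.num(x) / (1.0 + self.den(x).abs())


# The nonlinearity comes from the division itself, so stacked RatioLayers
# need no separate activation function between them:
net = nn.Sequential(RatioLayer(784, 128), RatioLayer(128, 10))
logits = net(torch.randn(4, 784))  # -> shape (4, 10)
```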
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.