Trainable Compound Activation Functions for Machine Learning
- URL: http://arxiv.org/abs/2204.12920v1
- Date: Mon, 25 Apr 2022 19:53:04 GMT
- Title: Trainable Compound Activation Functions for Machine Learning
- Authors: Paul M. Baggenstoss
- Abstract summary: Activation functions (AF) are necessary components of neural networks that allow approximation of functions.
We propose trainable compound AF (TCA) composed of a sum of shifted and scaled simple AFs.
- Score: 13.554038901140949
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Activation functions (AF) are necessary components of neural networks that
allow approximation of functions, but AFs in current use are usually simple
monotonically increasing functions. In this paper, we propose trainable
compound AF (TCA) composed of a sum of shifted and scaled simple AFs. TCAs
increase the effectiveness of networks with fewer parameters compared to added
layers. TCAs have a special interpretation in generative networks because they
effectively estimate the marginal distributions of each dimension of the data
using a mixture distribution, reducing modality and making linear dimension
reduction more effective. When used in restricted Boltzmann machines (RBMs),
they result in a novel type of RBM with mixture-based stochastic units.
Improved performance is demonstrated in experiments using RBMs, deep belief
networks (DBN), projected belief networks (PBN), and variational auto-encoders
(VAE).
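To make the construction concrete, the compound activation can be sketched as a small trainable module. The following is a minimal PyTorch sketch (not the authors' code); the sigmoid base activation, the component count, and the parameter names (scale, weight, shift) are illustrative assumptions.

    import torch
    import torch.nn as nn

    class TrainableCompoundActivation(nn.Module):
        # Minimal TCA sketch: y(x) = sum_k a_k * sigmoid(w_k * x + b_k), where the
        # scales a_k, weights w_k, and shifts b_k are trained along with the rest
        # of the network. Base AF, component count, and parameter names are
        # assumptions, not the paper's exact formulation.
        def __init__(self, num_components: int = 4):
            super().__init__()
            self.scale = nn.Parameter(torch.ones(num_components) / num_components)
            self.weight = nn.Parameter(torch.ones(num_components))
            self.shift = nn.Parameter(torch.linspace(-2.0, 2.0, num_components))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Broadcast the input against the K components, apply the shifted and
            # scaled base activations, and sum them into the compound output.
            x = x.unsqueeze(-1)                                        # (..., 1)
            components = torch.sigmoid(self.weight * x + self.shift)   # (..., K)
            return (self.scale * components).sum(dim=-1)               # (...)

The module is applied elementwise, e.g. TrainableCompoundActivation()(torch.randn(8, 16)) returns a tensor of the same (8, 16) shape. With sigmoid components and nonnegative scales, the derivative of such a sum is a mixture of logistic densities, which is consistent with the abstract's interpretation of TCAs as estimating each dimension's marginal distribution with a mixture distribution.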
Related papers
- NIDS Neural Networks Using Sliding Time Window Data Processing with Trainable Activations and its Generalization Capability [0.0]
This paper presents neural networks for network intrusion detection systems (NIDS) that operate on flow data preprocessed with a time window.
It requires only eleven features, which do not rely on deep packet inspection, can be found in most NIDS datasets, and are easily obtained from conventional flow collectors.
The reported training accuracy exceeds 99% for the proposed method with as few as twenty neural network input features.
arXiv Detail & Related papers (2024-10-24T11:36:19Z)
- Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node [49.08777822540483]
Fast feedforward networks (FFFs) exploit the observation that different regions of the input space activate distinct subsets of neurons in wide networks.
We propose the incorporation of load balancing and Master Leaf techniques into the FFF architecture to improve performance and simplify the training process.
arXiv Detail & Related papers (2024-05-27T05:06:24Z) - Leveraging Low-Rank and Sparse Recurrent Connectivity for Robust
Closed-Loop Control [63.310780486820796]
We show how a parameterization of recurrent connectivity influences robustness in closed-loop settings.
We find that closed-form continuous-time neural networks (CfCs) with fewer parameters can outperform their full-rank, fully-connected counterparts.
arXiv Detail & Related papers (2023-10-05T21:44:18Z) - Differentiable Neural Networks with RePU Activation: with Applications to Score Estimation and Isotonic Regression [7.450181695527364]
We study the properties of differentiable neural networks activated by rectified power unit (RePU) functions.
We establish error bounds for simultaneously approximating $C^s$ smooth functions and their derivatives using RePU-activated deep neural networks.
arXiv Detail & Related papers (2023-05-01T00:09:48Z) - Achieving Efficient Distributed Machine Learning Using a Novel
Non-Linear Class of Aggregation Functions [9.689867512720083]
Distributed machine learning (DML) over time-varying networks can be an enabler for emerging decentralized ML applications.
We propose a novel non-linear class of model aggregation functions to achieve efficient DML over time-varying networks.
arXiv Detail & Related papers (2022-01-29T03:43:26Z) - Federated Dynamic Sparse Training: Computing Less, Communicating Less,
Yet Learning Better [88.28293442298015]
Federated learning (FL) enables distribution of machine learning workloads from the cloud to resource-limited edge devices.
We develop, implement, and experimentally validate a novel FL framework termed Federated Dynamic Sparse Training (FedDST).
FedDST is a dynamic process that extracts and trains sparse sub-networks from the target full network.
arXiv Detail & Related papers (2021-12-18T02:26:38Z) - Diffusion Mechanism in Residual Neural Network: Theory and Applications [12.573746641284849]
In many learning tasks with limited training samples, the diffusion connects the labeled and unlabeled data points.
We propose a novel diffusion residual network (Diff-ResNet) that internally introduces diffusion into the architecture of neural networks.
Under the structured data assumption, it is proved that the proposed diffusion block can increase the distance-diameter ratio, which improves the separability of inter-class points.
arXiv Detail & Related papers (2021-05-07T10:42:59Z) - ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computations.
arXiv Detail & Related papers (2020-09-04T20:41:47Z) - Activation functions are not needed: the ratio net [3.9636371287541086]
This paper focuses on designing a new function approximator.
Instead of designing new activation functions or kernel functions, the newly proposed network uses a fractional form.
It shows that, in most cases, the ratio net converges faster and outperforms both the classical neural network and the RBF network.
arXiv Detail & Related papers (2020-05-14T01:07:56Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study distributed stochastic algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our method requires far fewer communication rounds than naive parallel approaches, both in theory and in practice.
Our experiments on several datasets demonstrate the effectiveness of the method and corroborate the theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z) - Training Deep Energy-Based Models with f-Divergence Minimization [113.97274898282343]
Deep energy-based models (EBMs) are very flexible in distribution parametrization but computationally challenging.
We propose a general variational framework termed f-EBM to train EBMs using any desired f-divergence.
Experimental results demonstrate the superiority of f-EBM over contrastive divergence, as well as the benefits of training EBMs using f-divergences other than KL.
arXiv Detail & Related papers (2020-03-06T23:11:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.