Trainable Compound Activation Functions for Machine Learning
- URL: http://arxiv.org/abs/2204.12920v1
- Date: Mon, 25 Apr 2022 19:53:04 GMT
- Title: Trainable Compound Activation Functions for Machine Learning
- Authors: Paul M. Baggenstoss
- Abstract summary: Activation functions (AF) are necessary components of neural networks that allow approximation of functions.
We propose trainable compound AF (TCA) composed of a sum of shifted and scaled simple AFs.
- Score: 13.554038901140949
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Activation functions (AF) are necessary components of neural networks that
allow approximation of functions, but AFs in current use are usually simple
monotonically increasing functions. In this paper, we propose trainable
compound AF (TCA) composed of a sum of shifted and scaled simple AFs. TCAs
increase the effectiveness of networks with fewer parameters compared to added
layers. TCAs have a special interpretation in generative networks because they
effectively estimate the marginal distributions of each dimension of the data
using a mixture distribution, reducing modality and making linear dimension
reduction more effective. When used in restricted Boltzmann machines (RBMs),
they result in a novel type of RBM with mixture-based stochastic units.
Improved performance is demonstrated in experiments using RBMs, deep belief
networks (DBN), projected belief networks (PBN), and variational auto-encoders
(VAE).
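To make the construction concrete, the compound activation can be sketched as a small trainable module. The following is a minimal PyTorch sketch (not the authors' code); the sigmoid base activation, the component count, and the parameter names (scale, weight, shift) are illustrative assumptions.

    import torch
    import torch.nn as nn

    class TrainableCompoundActivation(nn.Module):
        # Minimal TCA sketch: y(x) = sum_k a_k * sigmoid(w_k * x + b_k), where the
        # scales a_k, weights w_k, and shifts b_k are trained along with the rest
        # of the network. Base AF, component count, and parameter names are
        # assumptions, not the paper's exact formulation.
        def __init__(self, num_components: int = 4):
            super().__init__()
            self.scale = nn.Parameter(torch.ones(num_components) / num_components)
            self.weight = nn.Parameter(torch.ones(num_components))
            self.shift = nn.Parameter(torch.linspace(-2.0, 2.0, num_components))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Broadcast the input against the K components, apply the shifted and
            # scaled base activations, and sum them into the compound output.
            x = x.unsqueeze(-1)                                        # (..., 1)
            components = torch.sigmoid(self.weight * x + self.shift)   # (..., K)
            return (self.scale * components).sum(dim=-1)               # (...)

The module is applied elementwise, e.g. TrainableCompoundActivation()(torch.randn(8, 16)) returns a tensor of the same (8, 16) shape. With sigmoid components and nonnegative scales, the derivative of such a sum is a mixture of logistic densities, which is consistent with the abstract's interpretation of TCAs as estimating each dimension's marginal distribution with a mixture distribution.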
Related papers
- NIDS Neural Networks Using Sliding Time Window Data Processing with Trainable Activations and its Generalization Capability [0.0]
This paper presents neural networks for network intrusion detection systems (NIDS) that operate on flow data preprocessed with a time window.
It requires only eleven features, which do not rely on deep packet inspection, can be found in most NIDS datasets, and are easily obtained from conventional flow collectors.
The reported training accuracy exceeds 99% for the proposed method with as few as twenty neural network input features.
arXiv Detail & Related papers (2024-10-24T11:36:19Z)
- Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node [49.08777822540483]
Fast feedforward networks (FFFs) exploit the observation that different regions of the input space activate distinct subsets of neurons in wide networks.
We propose the incorporation of load balancing and Master Leaf techniques into the FFF architecture to improve performance and simplify the training process.
arXiv Detail & Related papers (2024-05-27T05:06:24Z) - Leveraging Low-Rank and Sparse Recurrent Connectivity for Robust
Closed-Loop Control [63.310780486820796]
We show how a parameterization of recurrent connectivity influences robustness in closed-loop settings.
We find that closed-form continuous-time neural networks (CfCs) with fewer parameters can outperform their full-rank, fully-connected counterparts.
arXiv Detail & Related papers (2023-10-05T21:44:18Z) - Differentiable Neural Networks with RePU Activation: with Applications to Score Estimation and Isotonic Regression [7.450181695527364]
We study the properties of differentiable neural networks activated by rectified power unit (RePU) functions.
We establish error bounds for simultaneously approximating $C^s$ smooth functions and their derivatives using RePU-activated deep neural networks.
arXiv Detail & Related papers (2023-05-01T00:09:48Z) - Achieving Efficient Distributed Machine Learning Using a Novel
Non-Linear Class of Aggregation Functions [9.689867512720083]
Distributed machine learning (DML) over time-varying networks can be an enabler for emerging decentralized ML applications.
We propose a novel non-linear class of model aggregation functions to achieve efficient DML over time-varying networks.
arXiv Detail & Related papers (2022-01-29T03:43:26Z) - Federated Dynamic Sparse Training: Computing Less, Communicating Less,
Yet Learning Better [88.28293442298015]
Federated learning (FL) enables distribution of machine learning workloads from the cloud to resource-limited edge devices.
We develop, implement, and experimentally validate a novel FL framework termed Federated Dynamic Sparse Training (FedDST).
FedDST is a dynamic process that extracts and trains sparse sub-networks from the target full network.
arXiv Detail & Related papers (2021-12-18T02:26:38Z) - Diffusion Mechanism in Residual Neural Network: Theory and Applications [12.573746641284849]
In many learning tasks with limited training samples, the diffusion connects the labeled and unlabeled data points.
We propose a novel diffusion residual network (Diff-ResNet) that internally introduces diffusion into the architecture of neural networks.
Under the structured data assumption, it is proved that the proposed diffusion block can increase the distance-diameter ratio, which improves the separability of inter-class points.
arXiv Detail & Related papers (2021-05-07T10:42:59Z) - ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computations.
arXiv Detail & Related papers (2020-09-04T20:41:47Z) - Activation functions are not needed: the ratio net [3.9636371287541086]
This paper focuses on designing a new function approximator.
Instead of designing new activation functions or kernel functions, the newly proposed network uses a fractional form.
It shows that, in most cases, the ratio net converges faster and outperforms both the classical neural network and the RBF network.
arXiv Detail & Related papers (2020-05-14T01:07:56Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study distributed stochastic algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our method requires far fewer communication rounds than naive parallel approaches, both in theory and in practice.
Our experiments on several datasets demonstrate the effectiveness of the method and corroborate the theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z) - Training Deep Energy-Based Models with f-Divergence Minimization [113.97274898282343]
Deep energy-based models (EBMs) are very flexible in distribution parametrization but computationally challenging.
We propose a general variational framework termed f-EBM to train EBMs using any desired f-divergence.
Experimental results demonstrate the superiority of f-EBM over contrastive divergence, as well as the benefits of training EBMs using f-divergences other than KL.
arXiv Detail & Related papers (2020-03-06T23:11:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.