Activation functions are not needed: the ratio net
- URL: http://arxiv.org/abs/2005.06678v2
- Date: Fri, 3 Dec 2021 06:22:27 GMT
- Title: Activation functions are not needed: the ratio net
- Authors: Chi-Chun Zhou, Hai-Long Tu, Yue-Jie Hou, Zhen Ling, Yi Liu, and Jian
Hua
- Abstract summary: This paper focuses on designing a new function approximator.
Instead of designing new activation functions or kernel functions, the proposed network uses a fractional form.
It shows that, in most cases, the ratio net converges faster and outperforms both the MLP and the RBF.
- Score: 3.9636371287541086
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A deep neural network for classification tasks essentially consists of two
components: feature extractors and function approximators. They usually work as
an integrated whole; however, improving either component can promote the
performance of the whole algorithm. This paper focuses on designing a new
function approximator. Conventionally, a function approximator is built from
nonlinear activation functions or nonlinear kernel functions, which yields
classical networks such as the feed-forward neural network (MLP) and the radial
basis function network (RBF). In this paper, a new function approximator that is
effective and efficient is proposed. Instead of designing new activation
functions or kernel functions, the proposed network uses a fractional form. For
the sake of convenience, we name the network the ratio net. We compare the
effectiveness and efficiency of the ratio net with those of the RBF and the MLP
with various kinds of activation functions on classification tasks using the
MNIST database of handwritten digits and the Internet Movie Database (IMDb), a
binary sentiment analysis dataset. The results show that, in most cases, the
ratio net converges faster and outperforms both the MLP and the RBF.
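To make the fractional-form idea concrete, below is a minimal, hedged sketch of a ratio-style unit: instead of passing a linear combination of features through an activation function, the output is the quotient of two learned polynomial expansions of the input. The class name, the degree-2 expansion, and the denominator stabilization term are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

# Illustrative sketch only: a "ratio" unit that outputs a quotient of two
# learned polynomial expansions instead of an activation-function output.
# The degree-2 expansion and the denominator guard are assumptions, not the
# formulation from the paper.

def poly_features(x, degree=2):
    """Stack the powers x^1 .. x^degree along the last axis."""
    return np.concatenate([x ** d for d in range(1, degree + 1)], axis=-1)

class RatioUnit:
    def __init__(self, in_dim, degree=2, eps=1e-3, seed=0):
        rng = np.random.default_rng(seed)
        feat_dim = in_dim * degree
        self.degree = degree
        self.w_num = rng.normal(scale=0.1, size=feat_dim)  # numerator weights
        self.w_den = rng.normal(scale=0.1, size=feat_dim)  # denominator weights
        self.b_num = 0.0
        self.b_den = 1.0   # bias the denominator away from zero
        self.eps = eps     # extra guard against division by a tiny value

    def forward(self, x):
        z = poly_features(x, self.degree)        # (batch, in_dim * degree)
        num = z @ self.w_num + self.b_num        # numerator polynomial
        den = z @ self.w_den + self.b_den        # denominator polynomial
        return num / (np.abs(den) + self.eps)    # fractional (ratio) output

# Toy forward pass: a batch of 4 samples with 3 features each.
x = np.random.default_rng(1).normal(size=(4, 3))
print(RatioUnit(in_dim=3).forward(x))
```

In a full classifier, several such units would be stacked or used as the output head and both sets of polynomial weights trained jointly by gradient descent; see the paper for the exact form the authors use.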
Related papers
- LeRF: Learning Resampling Function for Adaptive and Efficient Image Interpolation [64.34935748707673]
Recent deep neural networks (DNNs) have made impressive progress in performance by introducing learned data priors.
We propose a novel method of Learning Resampling (termed LeRF) which takes advantage of both the structural priors learned by DNNs and the locally continuous assumption.
LeRF assigns spatially varying resampling functions to input image pixels and learns to predict the shapes of these resampling functions with a neural network.
arXiv Detail & Related papers (2024-07-13T16:09:45Z)
- Your Network May Need to Be Rewritten: Network Adversarial Based on High-Dimensional Function Graph Decomposition [0.994853090657971]
We propose a network adversarial method to address the aforementioned challenges.
This is the first method to use different activation functions in a network.
We have achieved a substantial improvement over standard activation functions regarding both training efficiency and predictive accuracy.
arXiv Detail & Related papers (2024-05-04T11:22:30Z) - OPAF: Optimized Secure Two-Party Computation Protocols for Nonlinear Activation Functions in Recurrent Neural Network [8.825150825838769]
This paper pays special attention to the implementation of non-linear functions in semi-honest model with two-party settings.
We propose a novel and efficient protocol for exponential function by using a divide-and-conquer strategy.
Next, we take advantage of the symmetry of sigmoid and Tanh, and fine-tune the inputs to reduce the 2PC building blocks.
arXiv Detail & Related papers (2024-03-01T02:49:40Z)
- Provable Data Subset Selection For Efficient Neural Network Training [73.34254513162898]
We introduce the first algorithm to construct coresets for RBFNNs, i.e., small weighted subsets that approximate the loss of the input data on any radial basis function network.
We then perform empirical evaluations on function approximation and dataset subset selection on popular network architectures and data sets.
arXiv Detail & Related papers (2023-03-09T10:08:34Z)
- Trainable Compound Activation Functions for Machine Learning [13.554038901140949]
Activation functions (AF) are necessary components of neural networks that allow approximation of functions.
We propose trainable compound AF (TCA) composed of a sum of shifted and scaled simple AFs; a minimal sketch of this construction appears at the end of this list.
arXiv Detail & Related papers (2022-04-25T19:53:04Z)
- Graph-adaptive Rectified Linear Unit for Graph Neural Networks [64.92221119723048]
Graph Neural Networks (GNNs) have achieved remarkable success by extending traditional convolution to learning on non-Euclidean data.
We propose Graph-adaptive Rectified Linear Unit (GReLU) which is a new parametric activation function incorporating the neighborhood information in a novel and efficient way.
We conduct comprehensive experiments to show that our plug-and-play GReLU method is efficient and effective given different GNN backbones and various downstream tasks.
arXiv Detail & Related papers (2022-02-13T10:54:59Z)
- Otimização de pesos e funções de ativação de redes neurais aplicadas na previsão de séries temporais (Optimization of weights and activation functions of neural networks applied to time series forecasting) [0.0]
We propose the use of a family of asymmetric activation functions with free parameters for neural networks.
We show that this family of activation functions satisfies the requirements of the universal approximation theorem.
A methodology is used for the global optimization of this family of activation functions and of the weights of the connections between the processing units of the neural network.
arXiv Detail & Related papers (2021-07-29T23:32:15Z)
- Going Beyond Linear RL: Sample Efficient Neural Function Approximation [76.57464214864756]
We study function approximation with two-layer neural networks.
Our results significantly improve upon what can be attained with linear (or eluder dimension) methods.
arXiv Detail & Related papers (2021-07-14T03:03:56Z)
- Learning specialized activation functions with the Piecewise Linear Unit [7.820667552233989]
We propose a new activation function called the Piecewise Linear Unit (PWLU), which incorporates a carefully designed formulation and learning method.
It can learn specialized activation functions and achieves SOTA performance on large-scale datasets like ImageNet and COCO.
PWLU is also easy to implement and efficient at inference, which can be widely applied in real-world applications.
arXiv Detail & Related papers (2021-04-08T11:29:11Z)
- The Connection Between Approximation, Depth Separation and Learnability in Neural Networks [70.55686685872008]
We study the connection between learnability and approximation capacity.
We show that learnability with deep networks of a target function depends on the ability of simpler classes to approximate the target.
arXiv Detail & Related papers (2021-01-31T11:32:30Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
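As a rough illustration of the "sum of shifted and scaled simple AFs" construction mentioned in the Trainable Compound Activation Functions entry above, here is a minimal sketch. The choice of tanh as the base function, the number of components, and the fixed example parameters are assumptions for illustration, not the authors' parameterization; in the paper the shifts, scales, and weights would be learned during training.

```python
import numpy as np

# Hypothetical sketch of a trainable compound activation function (TCA):
# a weighted sum of shifted and scaled copies of a simple base activation.
# Base function (tanh), component count, and the example values are assumptions.

def compound_activation(x, scales, shifts, weights, base=np.tanh):
    """Evaluate sum_i weights[i] * base(scales[i] * x + shifts[i]) elementwise."""
    x = np.asarray(x)[..., None]                 # broadcast over components
    return (weights * base(scales * x + shifts)).sum(axis=-1)

# Three components with parameters that would normally be trainable.
scales  = np.array([1.0, 2.0, 0.5])
shifts  = np.array([0.0, -1.0, 1.0])
weights = np.array([0.6, 0.3, 0.1])

print(compound_activation(np.linspace(-2, 2, 5), scales, shifts, weights))
```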