Generalized Activation via Multivariate Projection
- URL: http://arxiv.org/abs/2309.17194v2
- Date: Sat, 27 Jan 2024 09:50:08 GMT
- Title: Generalized Activation via Multivariate Projection
- Authors: Jiayun Li, Yuxiao Cheng, Yiwen Lu, Zhuofan Xia, Yilin Mo, Gao Huang
- Abstract summary: Activation functions are essential to introduce nonlinearity into neural networks.
We consider ReLU as a projection from R onto the nonnegative half-line R+.
We extend ReLU by substituting it with a generalized projection operator onto a convex cone, such as the Second-Order Cone (SOC) projection.
- Score: 46.837481855573145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Activation functions are essential to introduce nonlinearity into neural
networks, with the Rectified Linear Unit (ReLU) often favored for its
simplicity and effectiveness. Motivated by the structural similarity between a
shallow Feedforward Neural Network (FNN) and a single iteration of the
Projected Gradient Descent (PGD) algorithm, a standard approach for solving
constrained optimization problems, we consider ReLU as a projection from R onto
the nonnegative half-line R+. Building on this interpretation, we extend ReLU
by substituting it with a generalized projection operator onto a convex cone,
such as the Second-Order Cone (SOC) projection, thereby naturally extending it
to a Multivariate Projection Unit (MPU), an activation function with multiple
inputs and multiple outputs. We further provide mathematical proof establishing
that FNNs activated by SOC projections outperform those utilizing ReLU in terms
of expressive power. Experimental evaluations on widely-adopted architectures
further corroborate MPU's effectiveness against a broader range of existing
activation functions.
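For concreteness, here is a minimal sketch of the projection view in PyTorch: ReLU is the Euclidean projection of each scalar onto R+, and the standard closed-form projection onto the second-order cone gives a multi-input, multi-output activation in the spirit of the MPU. How features are grouped into cones below is an illustrative assumption, not the paper's prescribed architecture.

```python
import torch

def relu_as_projection(z: torch.Tensor) -> torch.Tensor:
    """ReLU is exactly the Euclidean projection of each scalar onto R+."""
    return torch.clamp(z, min=0.0)

def soc_projection(x: torch.Tensor, t: torch.Tensor, eps: float = 1e-12):
    """Closed-form Euclidean projection of (x, t) onto the second-order cone
    K = {(x, t) : ||x||_2 <= t}.  Shapes: x is (..., n), t is (...,)."""
    norm = x.norm(dim=-1)                     # ||x|| per cone group
    inside = norm <= t                        # already in the cone: keep
    polar = norm <= -t                        # in the polar cone: project to 0
    scale = (t + norm) / (2.0 * norm.clamp(min=eps))
    x_proj = scale.unsqueeze(-1) * x          # boundary case otherwise
    t_proj = (t + norm) / 2.0
    x_proj = torch.where(inside.unsqueeze(-1), x, x_proj)
    t_proj = torch.where(inside, t, t_proj)
    x_proj = torch.where(polar.unsqueeze(-1), torch.zeros_like(x), x_proj)
    t_proj = torch.where(polar, torch.zeros_like(t), t_proj)
    return x_proj, t_proj

# Toy MPU-style usage (grouping is our assumption): split a feature vector
# into groups of size 3, treating the last coordinate of each group as t.
z = torch.randn(8, 12)                        # batch of 8, 12 features
g = z.view(8, 4, 3)                           # 4 cone groups of size 3
x_p, t_p = soc_projection(g[..., :2], g[..., 2])
out = torch.cat([x_p, t_p.unsqueeze(-1)], dim=-1).view(8, 12)
```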
Related papers
- Hysteresis Activation Function for Efficient Inference [3.5223695602582614]
We propose a Hysteresis Rectified Linear Unit (HeLU) to address the "dying ReLU" problem with minimal complexity.
Unlike traditional activation functions, which use a fixed threshold for both training and inference, HeLU employs a variable threshold that refines backpropagation.
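A minimal sketch of a hysteresis-style rectifier, under our reading of the summary: the forward pass is plain ReLU, while the backward pass uses a relaxed threshold so that slightly negative units still receive gradient. The threshold value and its placement are assumptions; the paper's exact rule may differ.

```python
import torch

class HysteresisReLU(torch.autograd.Function):
    """Sketch: forward is plain ReLU; backward lets gradients through
    whenever the pre-activation exceeds a relaxed threshold -lam, so
    units just below zero can recover (hedged reading of HeLU)."""

    @staticmethod
    def forward(ctx, z, lam: float = 0.1):
        ctx.save_for_backward(z)
        ctx.lam = lam
        return torch.clamp(z, min=0.0)

    @staticmethod
    def backward(ctx, grad_out):
        (z,) = ctx.saved_tensors
        mask = (z > -ctx.lam).to(grad_out.dtype)   # hysteresis band
        return grad_out * mask, None

# usage: y = HysteresisReLU.apply(x)
```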
arXiv Detail & Related papers (2024-11-15T20:46:58Z)
- A simple algorithm for output range analysis for deep neural networks [0.0]
This paper presents a novel approach for the output range estimation problem in Deep Neural Networks (DNNs) by integrating a Simulated Annealing (SA) algorithm.
The method effectively addresses the challenges posed by the lack of geometric information and the non-linearity inherent in ResNets.
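A minimal sketch of the idea, assuming a scalar-output network and a box-shaped input set; the interface, proposal scale, and cooling schedule below are our assumptions, not the paper's algorithm:

```python
import math
import random
import torch

@torch.no_grad()
def sa_output_max(net, lo, hi, steps=2000, T0=1.0, decay=0.999):
    """Illustrative simulated-annealing search for the maximum of a
    scalar network output over the box [lo, hi] (elementwise bounds).
    Assumes net maps a 1-D tensor to a single value."""
    x = lo + (hi - lo) * torch.rand_like(lo)      # random start in the box
    best_x, best_v = x.clone(), net(x).item()
    cur_v, T = best_v, T0
    for _ in range(steps):
        cand = x + 0.1 * (hi - lo) * torch.randn_like(x)
        cand = torch.minimum(torch.maximum(cand, lo), hi)   # stay in the box
        v = net(cand).item()
        # Accept uphill moves always, downhill moves with Boltzmann probability.
        if v > cur_v or random.random() < math.exp((v - cur_v) / T):
            x, cur_v = cand, v
            if v > best_v:
                best_x, best_v = cand.clone(), v
        T *= decay                                # cool the temperature
    return best_x, best_v
```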
arXiv Detail & Related papers (2024-07-02T22:47:40Z)
- Improving the Expressive Power of Deep Neural Networks through Integral Activation Transform [12.36064367319084]
We generalize the traditional fully connected deep neural network (DNN) through the concept of continuous width.
We show that IAT-ReLU exhibits a continuous activation pattern when continuous basis functions are employed.
Our numerical experiments demonstrate that IAT-ReLU outperforms regular ReLU in terms of trainability and smoothness.
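As a rough illustration of continuous width, one can index hidden units by a continuous s in [0, 1], generate the weights as functions of s, and estimate the resulting integral by sampling. The hypernetwork parameterization below is our assumption and is not the paper's IAT construction:

```python
import torch
import torch.nn as nn

class ContinuousWidthLayer(nn.Module):
    """Sketch: hidden units indexed by a continuous s in [0, 1]; the
    weights w(s), b(s), a(s) are produced by small linear maps of s, and
    the layer output is a Monte Carlo estimate of the integral over s
    (illustrative discretization only)."""

    def __init__(self, d_in, d_out, n_samples=64):
        super().__init__()
        self.n_samples = n_samples
        self.w = nn.Linear(1, d_in)      # s -> input weights w(s)
        self.b = nn.Linear(1, 1)         # s -> bias b(s)
        self.a = nn.Linear(1, d_out)     # s -> output weights a(s)

    def forward(self, x):                # x: (batch, d_in)
        s = torch.rand(self.n_samples, 1, device=x.device)  # sampled indices
        h = torch.relu(x @ self.w(s).T + self.b(s).T)       # (batch, n_samples)
        return h @ self.a(s) / self.n_samples               # integral estimate
```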
arXiv Detail & Related papers (2023-12-19T20:23:33Z)
- Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation [49.44309457870649]
We present Layer-wise Feedback Propagation (LFP), a novel training principle for neural network-like predictors.
LFP decomposes a reward to individual neurons based on their respective contributions to solving a given task.
Our method then implements a greedy approach reinforcing helpful parts of the network and weakening harmful ones.
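A toy, single-layer illustration of the reward-decomposition idea, with the contribution measure and update rule chosen by us rather than taken from the paper:

```python
import torch

def lfp_step(W, x, reward, lr=0.01):
    """Toy illustration: split a scalar reward across neurons in proportion
    to their contribution to the output, then reinforce (or, for negative
    feedback, weaken) the responsible weights.  Not the paper's exact
    propagation rule."""
    z = W @ x                                        # pre-activations, (m,)
    contrib = z.abs() / z.abs().sum().clamp(min=1e-12)
    feedback = reward * contrib                      # per-neuron reward share
    # Greedy update: strengthen each neuron's response in proportion to its
    # feedback; negative feedback pushes the weights the other way.
    return W + lr * feedback.unsqueeze(1) * torch.sign(z).unsqueeze(1) * x
```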
arXiv Detail & Related papers (2023-08-23T10:48:28Z)
- Non-stationary Reinforcement Learning under General Function Approximation [60.430936031067006]
We first propose a new complexity metric called dynamic Bellman Eluder (DBE) dimension for non-stationary MDPs.
Based on the proposed complexity metric, we propose a novel confidence-set based model-free algorithm called SW-OPEA.
We show that SW-OPEA is provably efficient as long as the variation budget is not significantly large.
arXiv Detail & Related papers (2023-06-01T16:19:37Z)
- Revisiting GANs by Best-Response Constraint: Perspective, Methodology, and Application [49.66088514485446]
Best-Response Constraint (BRC) is a general learning framework to explicitly formulate the potential dependency of the generator on the discriminator.
We show that, despite their different motivations and formulations, a variety of existing GANs can all be uniformly improved by our flexible BRC methodology.
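One simple way to realize the generator-on-discriminator dependency is to approximate the discriminator's best response with a few inner updates before each generator step; this sketch is illustrative and much narrower than the paper's BRC framework:

```python
import torch

def brc_gan_step(G, D, g_opt, d_opt, real, z, inner_steps=5):
    """Sketch: drive D toward its best response with a few inner updates
    before each G step, so the generator's update is taken against an
    approximation of D*(G).  Illustrative instantiation only."""
    bce = torch.nn.functional.binary_cross_entropy_with_logits
    fake = G(z).detach()
    for _ in range(inner_steps):                 # approximate best response
        dr, df = D(real), D(fake)
        d_loss = bce(dr, torch.ones_like(dr)) + bce(df, torch.zeros_like(df))
        d_opt.zero_grad(); d_loss.backward(); d_opt.step()
    logits = D(G(z))                             # G's loss through D*(G)
    g_loss = bce(logits, torch.ones_like(logits))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```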
arXiv Detail & Related papers (2022-05-20T12:42:41Z)
- Multi-Head ReLU Implicit Neural Representation Networks [3.04585143845864]
A novel multi-head multi-layer perceptron (MLP) structure is presented for implicit neural representation (INR).
We show that the proposed model does not suffer from the spectral bias of conventional ReLU networks and has superior capabilities.
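A minimal sketch of the multi-head structure, assuming a shared trunk with several small output heads (the head count and output split are our assumptions):

```python
import torch
import torch.nn as nn

class MultiHeadINR(nn.Module):
    """Sketch: a shared MLP trunk maps a coordinate to a feature vector,
    and several small heads each reconstruct one part of the signal."""

    def __init__(self, d_coord=2, d_hidden=128, n_heads=4, d_out=1):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(d_coord, d_hidden), nn.ReLU(),
            nn.Linear(d_hidden, d_hidden), nn.ReLU(),
        )
        self.heads = nn.ModuleList(
            [nn.Linear(d_hidden, d_out) for _ in range(n_heads)]
        )

    def forward(self, coords):               # coords: (batch, d_coord)
        h = self.trunk(coords)
        # Each head predicts its own patch/segment of the signal.
        return torch.stack([head(h) for head in self.heads], dim=1)
```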
arXiv Detail & Related papers (2021-10-07T13:27:35Z)
- Iterative Algorithm Induced Deep-Unfolding Neural Networks: Precoding Design for Multiuser MIMO Systems [59.804810122136345]
We propose a framework for deep-unfolding, where a general form of iterative algorithm induced deep-unfolding neural network (IAIDNN) is developed.
An efficient IAIDNN based on the structure of the classic weighted minimum mean-square error (WMMSE) iterative algorithm is developed.
We show that the proposed IAIDNN efficiently achieves the performance of the iterative WMMSE algorithm with reduced computational complexity.
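The general deep-unfolding pattern can be sketched by unrolling a simple iterative algorithm into layers with learnable parameters; the example below unfolds gradient descent on a least-squares objective rather than the paper's WMMSE iteration:

```python
import torch
import torch.nn as nn

class UnfoldedGD(nn.Module):
    """Generic deep-unfolding sketch: K iterations of gradient descent on
    ||Ax - y||^2 are unrolled into K layers with learnable step sizes.
    Illustrates the unfolding pattern only, not the WMMSE-based IAIDNN."""

    def __init__(self, K=10):
        super().__init__()
        self.steps = nn.Parameter(0.1 * torch.ones(K))  # one step size per layer

    def forward(self, A, y):
        x = torch.zeros(A.shape[1], device=A.device)
        for alpha in self.steps:                    # each loop = one layer
            x = x - alpha * (A.T @ (A @ x - y))     # learnable gradient step
        return x
```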
arXiv Detail & Related papers (2020-06-15T02:57:57Z)
- Iterative Network for Image Super-Resolution [69.07361550998318]
Single image super-resolution (SISR) has been greatly revitalized by the recent development of convolutional neural networks (CNNs).
This paper provides a new insight into conventional SISR algorithms and proposes a substantially different approach relying on iterative optimization.
A novel iterative super-resolution network (ISRN) is proposed on top of the iterative optimization.
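A minimal sketch of iterative, optimization-flavored super-resolution, with the refinement block and iteration rule chosen for illustration (the actual ISRN differs):

```python
import torch
import torch.nn as nn

class IterativeSR(nn.Module):
    """Sketch: start from a bicubic upsample and let a shared refinement
    block repeatedly reduce the residual (illustrative iteration rule)."""

    def __init__(self, scale=2, channels=3, width=32, iters=4):
        super().__init__()
        self.iters = iters
        self.up = nn.Upsample(scale_factor=scale, mode="bicubic",
                              align_corners=False)
        self.refine = nn.Sequential(
            nn.Conv2d(channels, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, channels, 3, padding=1),
        )

    def forward(self, lr_img):               # lr_img: (batch, C, h, w)
        x = self.up(lr_img)                  # initial estimate
        for _ in range(self.iters):
            x = x + self.refine(x)           # iterative residual correction
        return x
```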
arXiv Detail & Related papers (2020-05-20T11:11:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.