Generalized Activation via Multivariate Projection
- URL: http://arxiv.org/abs/2309.17194v2
- Date: Sat, 27 Jan 2024 09:50:08 GMT
- Title: Generalized Activation via Multivariate Projection
- Authors: Jiayun Li, Yuxiao Cheng, Yiwen Lu, Zhuofan Xia, Yilin Mo, Gao Huang
- Abstract summary: Activation functions are essential to introduce nonlinearity into neural networks.
We consider ReLU as a projection from R onto the nonnegative half-line R+.
We extend ReLU by substituting it with a generalized projection operator onto a convex cone, such as the Second-Order Cone (SOC) projection.
- Score: 46.837481855573145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Activation functions are essential to introduce nonlinearity into neural
networks, with the Rectified Linear Unit (ReLU) often favored for its
simplicity and effectiveness. Motivated by the structural similarity between a
shallow Feedforward Neural Network (FNN) and a single iteration of the
Projected Gradient Descent (PGD) algorithm, a standard approach for solving
constrained optimization problems, we consider ReLU as a projection from R onto
the nonnegative half-line R+. Building on this interpretation, we extend ReLU
by substituting it with a generalized projection operator onto a convex cone,
such as the Second-Order Cone (SOC) projection, thereby naturally extending it
to a Multivariate Projection Unit (MPU), an activation function with multiple
inputs and multiple outputs. We further provide mathematical proof establishing
that FNNs activated by SOC projections outperform those utilizing ReLU in terms
of expressive power. Experimental evaluations on widely-adopted architectures
further corroborate MPU's effectiveness against a broader range of existing
activation functions.
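To make the projection view concrete, the sketch below (plain NumPy, not the authors' released code) illustrates both halves of the abstract: a PGD iterate x_{k+1} = P_C(x_k - a * grad f(x_k)) reads as "linear map, then ReLU" when C is the nonnegative orthant R_+^n, and swapping in the closed-form projection onto the second-order cone K = {(x, t) : ||x||_2 <= t} yields an activation with multiple inputs and multiple outputs. The names `soc_projection` and `mpu`, and the fixed-size grouping of the preactivation, are illustrative assumptions rather than the paper's exact parameterization.

```python
import numpy as np


def relu(z):
    """Projection of z onto the nonnegative orthant R_+^n (elementwise ReLU)."""
    return np.maximum(z, 0.0)


def pgd_step(x, grad_f, step, project):
    """One PGD iterate: project(x - step * grad_f(x)). With project=relu this
    has the same structure as a dense layer followed by a ReLU activation."""
    return project(x - step * grad_f(x))


def soc_projection(z):
    """Project z = (x, t) in R^{n+1} onto the second-order cone
    K = {(x, t) : ||x||_2 <= t} via the standard closed form:
      ||x|| <= t   ->  z (already in K)
      ||x|| <= -t  ->  0 (z lies in the polar cone)
      otherwise    ->  ((t + ||x||) / 2) * (x / ||x||, 1)
    """
    x, t = z[:-1], z[-1]
    nx = np.linalg.norm(x)
    if nx <= t:
        return z
    if nx <= -t:
        return np.zeros_like(z)
    scale = 0.5 * (t + nx)                 # here nx > |t| >= 0, so nx > 0
    return np.concatenate([scale * x / nx, [scale]])


def mpu(z, group_size=3):
    """Illustrative Multivariate Projection Unit: split the preactivation
    into blocks of `group_size` and project each block onto an SOC."""
    groups = z.reshape(-1, group_size)
    return np.stack([soc_projection(g) for g in groups]).reshape(-1)


# One PGD step for f(x) = 0.5 * ||x - c||^2 over C = R_+^n with step 1
# returns relu(c): the iterate reads exactly like "linear map, then ReLU".
c = np.array([1.0, -2.0, 0.5])
print(pgd_step(np.zeros(3), lambda x: x - c, 1.0, relu))  # [1. 0. 0.5]

# The MPU acts as a multi-input, multi-output activation.
print(mpu(np.random.randn(6)))
```

For a one-dimensional input the cone degenerates to the half-line R+ and the projection reduces to max(t, 0), so this construction strictly contains ReLU as a special case, consistent with the abstract's expressive-power claim.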
Related papers
- Improving the Expressive Power of Deep Neural Networks through Integral
Activation Transform [12.36064367319084]
We generalize the traditional fully connected deep neural network (DNN) through the concept of continuous width.
We show that IAT-ReLU exhibits a continuous activation pattern when continuous basis functions are employed.
Our numerical experiments demonstrate that IAT-ReLU outperforms regular ReLU in terms of trainability and smoothness.
arXiv Detail & Related papers (2023-12-19T20:23:33Z) - Non-stationary Reinforcement Learning under General Function
Approximation [60.430936031067006]
We first propose a new complexity metric called dynamic Bellman Eluder (DBE) dimension for non-stationary MDPs.
Based on the proposed complexity metric, we propose a novel confidence-set based model-free algorithm called SW-OPEA.
We show that SW-OPEA is provably efficient as long as the variation budget is not significantly large.
arXiv Detail & Related papers (2023-06-01T16:19:37Z) - Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, defined by minimizing the population loss, that are more suitable for active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z) - Revisiting GANs by Best-Response Constraint: Perspective, Methodology,
and Application [49.66088514485446]
Best-Response Constraint (BRC) is a general learning framework to explicitly formulate the potential dependency of the generator on the discriminator.
We show that even with different motivations and formulations, a variety of existing GANs can all be uniformly improved by our flexible BRC methodology.
arXiv Detail & Related papers (2022-05-20T12:42:41Z) - Multi-Head ReLU Implicit Neural Representation Networks [3.04585143845864]
A novel multi-head multi-layer perceptron (MLP) structure is presented for implicit neural representation (INR).
We show that the proposed model does not suffer from the spectral bias of conventional ReLU networks and has superior capabilities.
arXiv Detail & Related papers (2021-10-07T13:27:35Z) - Neural Spectrahedra and Semidefinite Lifts: Global Convex Optimization
of Polynomial Activation Neural Networks in Fully Polynomial-Time [31.94590517036704]
We develop exact convex optimization formulations for two-layer neural networks with second-degree polynomial activations.
We show that semidefinite lifting is always exact, and therefore global optimization is of polynomial complexity in the dimension and sample size for all input data.
The proposed approach is significantly faster than the standard backpropagation procedure and obtains better test accuracy.
arXiv Detail & Related papers (2021-01-07T08:43:01Z) - Iterative Algorithm Induced Deep-Unfolding Neural Networks: Precoding
Design for Multiuser MIMO Systems [59.804810122136345]
We propose a framework for deep unfolding, where a general form of iterative-algorithm-induced deep-unfolding neural network (IAIDNN) is developed.
An efficient IAIDNN based on the structure of the classic weighted minimum mean-square error (WMMSE) iterative algorithm is developed.
We show that the proposed IAIDNN efficiently achieves the performance of the iterative WMMSE algorithm with reduced computational complexity (a generic unfolding sketch appears after this list).
arXiv Detail & Related papers (2020-06-15T02:57:57Z) - Iterative Network for Image Super-Resolution [69.07361550998318]
Single image super-resolution (SISR) has been greatly revitalized by the recent development of convolutional neural networks (CNNs).
This paper provides new insight into conventional SISR algorithms and proposes a substantially different approach relying on iterative optimization.
A novel iterative super-resolution network (ISRN) is proposed on top of this iterative optimization.
arXiv Detail & Related papers (2020-05-20T11:11:47Z) - Multivariate Functional Regression via Nested Reduced-Rank
Regularization [2.730097437607271]
We propose a nested reduced-rank regression (NRRR) approach for fitting regression models with multivariate functional responses and predictors.
We show through non-asymptotic analysis that NRRR can achieve at least a comparable error rate to that of classical reduced-rank regression (a sketch of that baseline appears after this list).
We apply NRRR to an electricity demand problem, relating the trajectories of daily electricity consumption to those of daily temperature.
arXiv Detail & Related papers (2020-03-10T14:58:54Z)
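The IAIDNN entry above unfolds the WMMSE precoding algorithm; that specific architecture is not reproduced here. As a generic stand-in, the following sketch applies the same deep-unfolding recipe to a simpler, standard algorithm (ISTA for the LASSO): truncate the iteration to a fixed number of layers and expose per-iteration constants (here the thresholds) as the parameters a network would learn. The function names and the choice of ISTA are our assumptions, not the paper's.

```python
import numpy as np


def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1, the nonlinearity of each ISTA step."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)


def unfolded_ista(y, A, thetas):
    """Generic deep unfolding: run len(thetas) ISTA iterations for
    min_x 0.5 * ||y - A x||^2 + lam * ||x||_1, treating the per-layer
    thresholds `thetas` as the parameters training would adjust."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for theta in thetas:
        grad = A.T @ (A @ x - y)           # the "linear" part of the layer
        x = soft_threshold(x - grad / L, theta)  # the "activation" part
    return x


# Usage: recover a sparse vector; setting every theta to lam / L makes each
# layer one exact ISTA iteration, so the untrained network matches the algorithm.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 50))
x_true = np.zeros(50)
x_true[[3, 17, 41]] = [1.0, -2.0, 1.5]
y = A @ x_true
lam = 0.05
L = np.linalg.norm(A, 2) ** 2
x_hat = unfolded_ista(y, A, thetas=[lam / L] * 200)
print(np.round(x_hat[[3, 17, 41]], 2))    # close to [1.0, -2.0, 1.5]
```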
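The NRRR entry compares against classical reduced-rank regression. For reference, here is a minimal sketch of that baseline's closed-form solution (project the OLS coefficients onto the top right-singular subspace of the OLS fitted values); the nested, functional variant the paper proposes is not reproduced, and the helper name is ours.

```python
import numpy as np


def reduced_rank_regression(X, Y, rank):
    """Classical reduced-rank regression: argmin_B ||Y - X B||_F^2 subject to
    rank(B) <= rank. Closed form: take the OLS solution, then project it onto
    the span of the top right singular vectors of the fitted values X B_ols."""
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    V_r = Vt[:rank].T                      # top-`rank` right singular vectors
    return B_ols @ V_r @ V_r.T             # coefficient matrix of rank <= `rank`


# Usage: a rank-1 ground truth is recovered well by a rank-1 fit.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 8))
B_true = np.outer(rng.standard_normal(8), rng.standard_normal(5))
Y = X @ B_true + 0.1 * rng.standard_normal((200, 5))
B_hat = reduced_rank_regression(X, Y, rank=1)
print(np.linalg.matrix_rank(B_hat))       # 1
print(np.linalg.norm(B_hat - B_true))     # small estimation error
```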
This list is automatically generated from the titles and abstracts of the papers on this site.