Efficient Vectorized Backpropagation Algorithms for Training Feedforward
Networks Composed of Quadratic Neurons
- URL: http://arxiv.org/abs/2310.02901v2
- Date: Sat, 13 Jan 2024 19:22:19 GMT
- Title: Efficient Vectorized Backpropagation Algorithms for Training Feedforward
Networks Composed of Quadratic Neurons
- Authors: Mathew Mithra Noel and Venkataraman Muthiah-Nakarajan
- Abstract summary: This paper presents a solution to the XOR problem with a single quadratic neuron.
It shows that any dataset composed of $\mathcal{C}$ bounded clusters can be separated with only a single layer of $\mathcal{C}$ quadratic neurons.
- Score: 1.9580473532948401
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Higher order artificial neurons whose outputs are computed by applying an
activation function to a higher order multinomial function of the inputs have
been considered in the past, but did not gain acceptance due to the extra
parameters and computational cost. However, higher order neurons have
significantly greater learning capabilities since the decision boundaries of
higher order neurons can be complex surfaces instead of just hyperplanes. The
boundary of a single quadratic neuron can be a general hyper-quadric surface
allowing it to learn many nonlinearly separable datasets. Since quadratic forms
can be represented by symmetric matrices, only $\frac{n(n+1)}{2}$ additional
parameters are needed instead of $n^2$. A quadratic Logistic regression model
is first presented. Solutions to the XOR problem with a single quadratic neuron
are considered. The complete vectorized equations for both forward and backward
propagation in feedforward networks composed of quadratic neurons are derived.
A reduced parameter quadratic neural network model with just $ n $ additional
parameters per neuron that provides a compromise between learning ability and
computational cost is presented. Comparisons on benchmark classification
datasets are used to demonstrate that a final layer of quadratic neurons
enables networks to achieve higher accuracy with significantly fewer hidden
layer neurons. In particular, this paper shows that any dataset composed of
$\mathcal{C}$ bounded clusters can be separated with only a single layer of
$\mathcal{C}$ quadratic neurons.
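The abstract's XOR claim can be illustrated with a short sketch. This is a minimal example assuming a sigmoid activation; the specific weights below are hand-picked for illustration and are not taken from the paper:

```python
import numpy as np

def quadratic_neuron(X, W, b, c):
    """Quadratic neuron: sigmoid(x^T W x + b^T x + c) for each row x of X.

    Since W is symmetric, it contributes only n(n+1)/2 extra parameters
    beyond the usual linear weights b and bias c. (The paper's reduced
    model restricts W to a diagonal, i.e. just n extra parameters.)"""
    z = np.einsum('bi,ij,bj->b', X, W, X) + X @ b + c
    return 1.0 / (1.0 + np.exp(-z))

# XOR dataset: nonlinearly separable for a single linear neuron.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])

# Hand-picked weights giving z = -2*x1*x2 + x1 + x2 - 0.5,
# a hyperbolic decision boundary that separates XOR.
W = np.array([[0., -1.], [-1., 0.]])  # symmetric quadratic term
b = np.array([1., 1.])                # linear term
c = -0.5                              # bias

preds = (quadratic_neuron(X, W, b, c) > 0.5).astype(float)
print(preds)  # [0. 1. 1. 0.] -- XOR solved by a single neuron
```

The quadratic term $x^\top W x = -2x_1x_2$ makes the pre-activation negative exactly when both inputs agree, which no hyperplane can achieve.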
Related papers
- No One-Size-Fits-All Neurons: Task-based Neurons for Artificial Neural Networks [25.30801109401654]
Since the human brain uses task-based neurons, can artificial network design move from task-based architecture design to task-based neuron design?
We propose a two-step framework for prototyping task-based neurons.
Experiments show that the proposed task-based neuron design is not only feasible but also delivers competitive performance over other state-of-the-art models.
arXiv Detail & Related papers (2024-05-03T09:12:46Z) - PAON: A New Neuron Model using Padé Approximants [6.337675203577426]
Convolutional neural networks (CNN) are built upon the classical McCulloch-Pitts neuron model.
We introduce a brand new neuron model called Padé neurons (Paons), inspired by the Padé approximants.
Our experiments on the single-image super-resolution task show that PadeNets can obtain better results than competing architectures.
arXiv Detail & Related papers (2024-03-18T13:49:30Z) - One Neuron Saved Is One Neuron Earned: On Parametric Efficiency of
Quadratic Networks [21.5187335186035]
We show that quadratic networks enjoy parametric efficiency, thereby confirming that the superior performance of quadratic networks is due to the intrinsic expressive capability.
From the perspective of the Barron space, we demonstrate that there exists a functional space whose functions can be approximated by quadratic networks in a dimension-free error.
arXiv Detail & Related papers (2023-03-11T05:32:18Z) - Cross-Model Comparative Loss for Enhancing Neuronal Utility in Language
Understanding [82.46024259137823]
We propose a cross-model comparative loss for a broad range of tasks.
We demonstrate the universal effectiveness of comparative loss through extensive experiments on 14 datasets from 3 distinct NLU tasks.
arXiv Detail & Related papers (2023-01-10T03:04:27Z) - On Expressivity and Trainability of Quadratic Networks [12.878230964137014]
Quadratic artificial neurons can play an important role in deep learning models.
We show that the superior expressivity of a quadratic network over either a conventional network or a conventional network with a quadratic activation function has not been fully elucidated.
We propose an effective training strategy referred to as ReLinear to stabilize the training process of a quadratic network.
arXiv Detail & Related papers (2021-10-12T15:33:32Z) - Training Feedback Spiking Neural Networks by Implicit Differentiation on
the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can solve this separation problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - On Function Approximation in Reinforcement Learning: Optimism in the
Face of Large State Spaces [208.67848059021915]
We study the exploration-exploitation tradeoff at the core of reinforcement learning.
In particular, we prove that the complexity of the function class $\mathcal{F}$ characterizes the complexity of the learning problem.
Our regret bounds are independent of the number of episodes.
arXiv Detail & Related papers (2020-11-09T18:32:22Z) - Towards Understanding Hierarchical Learning: Benefits of Neural
Representations [160.33479656108926]
In this work, we demonstrate that intermediate neural representations add more flexibility to neural networks.
We show that neural representation can achieve improved sample complexities compared with the raw input.
Our results characterize when neural representations are beneficial, and may provide a new perspective on why depth is important in deep learning.
arXiv Detail & Related papers (2020-06-24T02:44:54Z) - Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
We show that a standard neuron followed by our novel apical dendrite activation (ADA) can learn the XOR logical function with 100% accuracy.
We conduct experiments on six benchmark data sets from computer vision, signal processing and natural language processing.
arXiv Detail & Related papers (2020-02-02T21:09:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.