Efficient Vectorized Backpropagation Algorithms for Training Feedforward
Networks Composed of Quadratic Neurons
- URL: http://arxiv.org/abs/2310.02901v2
- Date: Sat, 13 Jan 2024 19:22:19 GMT
- Title: Efficient Vectorized Backpropagation Algorithms for Training Feedforward
Networks Composed of Quadratic Neurons
- Authors: Mathew Mithra Noel and Venkataraman Muthiah-Nakarajan
- Abstract summary: This paper presents a solution to the XOR problem with a single quadratic neuron.
It shows that any dataset composed of $\mathcal{C}$ bounded clusters can be separated with only a single layer of $\mathcal{C}$ quadratic neurons.
- Score: 1.9580473532948401
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Higher order artificial neurons whose outputs are computed by applying an
activation function to a higher order multinomial function of the inputs have
been considered in the past, but did not gain acceptance due to the extra
parameters and computational cost. However, higher order neurons have
significantly greater learning capabilities since the decision boundaries of
higher order neurons can be complex surfaces instead of just hyperplanes. The
boundary of a single quadratic neuron can be a general hyper-quadric surface
allowing it to learn many nonlinearly separable datasets. Since quadratic forms
can be represented by symmetric matrices, only $\frac{n(n+1)}{2}$ additional
parameters are needed instead of $n^2$. A quadratic Logistic regression model
is first presented. Solutions to the XOR problem with a single quadratic neuron
are considered. The complete vectorized equations for both forward and backward
propagation in feedforward networks composed of quadratic neurons are derived.
A reduced parameter quadratic neural network model with just $ n $ additional
parameters per neuron that provides a compromise between learning ability and
computational cost is presented. Comparisons on benchmark classification
datasets are used to demonstrate that a final layer of quadratic neurons
enables networks to achieve higher accuracy with significantly fewer hidden
layer neurons. In particular, this paper shows that any dataset composed of
$\mathcal{C}$ bounded clusters can be separated with only a single layer of
$\mathcal{C}$ quadratic neurons.
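The abstract's XOR claim can be illustrated with a short sketch. This is a minimal example assuming a sigmoid activation; the specific weights below are hand-picked for illustration and are not taken from the paper:

```python
import numpy as np

def quadratic_neuron(X, W, b, c):
    """Quadratic neuron: sigmoid(x^T W x + b^T x + c) for each row x of X.

    Since W is symmetric, it contributes only n(n+1)/2 extra parameters
    beyond the usual linear weights b and bias c. (The paper's reduced
    model restricts W to a diagonal, i.e. just n extra parameters.)"""
    z = np.einsum('bi,ij,bj->b', X, W, X) + X @ b + c
    return 1.0 / (1.0 + np.exp(-z))

# XOR dataset: nonlinearly separable for a single linear neuron.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])

# Hand-picked weights giving z = -2*x1*x2 + x1 + x2 - 0.5,
# a hyperbolic decision boundary that separates XOR.
W = np.array([[0., -1.], [-1., 0.]])  # symmetric quadratic term
b = np.array([1., 1.])                # linear term
c = -0.5                              # bias

preds = (quadratic_neuron(X, W, b, c) > 0.5).astype(float)
print(preds)  # [0. 1. 1. 0.] -- XOR solved by a single neuron
```

The quadratic term $x^\top W x = -2x_1x_2$ makes the pre-activation negative exactly when both inputs agree, which no hyperplane can achieve.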
Related papers
- No One-Size-Fits-All Neurons: Task-based Neurons for Artificial Neural Networks [25.30801109401654]
Since the human brain uses task-based neurons, can artificial network design move from task-based architecture design to task-based neuron design?
We propose a two-step framework for prototyping task-based neurons.
Experiments show that the proposed task-based neuron design is not only feasible but also delivers competitive performance over other state-of-the-art models.
arXiv Detail & Related papers (2024-05-03T09:12:46Z) - PAON: A New Neuron Model using Padé Approximants [6.337675203577426]
Convolutional neural networks (CNN) are built upon the classical McCulloch-Pitts neuron model.
We introduce a brand new neuron model called Padé neurons (Paons), inspired by the Padé approximants.
Our experiments on the single-image super-resolution task show that PadeNets can obtain better results than competing architectures.
arXiv Detail & Related papers (2024-03-18T13:49:30Z) - One Neuron Saved Is One Neuron Earned: On Parametric Efficiency of
Quadratic Networks [21.5187335186035]
We show that quadratic networks enjoy parametric efficiency, thereby confirming that the superior performance of quadratic networks is due to the intrinsic expressive capability.
From the perspective of the Barron space, we demonstrate that there exists a functional space whose functions can be approximated by quadratic networks in a dimension-free error.
arXiv Detail & Related papers (2023-03-11T05:32:18Z) - Cross-Model Comparative Loss for Enhancing Neuronal Utility in Language
Understanding [82.46024259137823]
We propose a cross-model comparative loss for a broad range of tasks.
We demonstrate the universal effectiveness of comparative loss through extensive experiments on 14 datasets from 3 distinct NLU tasks.
arXiv Detail & Related papers (2023-01-10T03:04:27Z) - On Expressivity and Trainability of Quadratic Networks [12.878230964137014]
Quadratic artificial neurons can play an important role in deep learning models.
We show that the superior expressivity of a quadratic network over either a conventional network or a conventional network with a quadratic activation function has not been fully elucidated.
We propose an effective training strategy referred to as ReLinear to stabilize the training process of a quadratic network.
arXiv Detail & Related papers (2021-10-12T15:33:32Z) - Training Feedback Spiking Neural Networks by Implicit Differentiation on
the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can solve this separation problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - On Function Approximation in Reinforcement Learning: Optimism in the
Face of Large State Spaces [208.67848059021915]
We study the exploration-exploitation tradeoff at the core of reinforcement learning.
In particular, we prove that the complexity of the function class $\mathcal{F}$ characterizes the complexity of the learning problem.
Our regret bounds are independent of the number of episodes.
arXiv Detail & Related papers (2020-11-09T18:32:22Z) - Towards Understanding Hierarchical Learning: Benefits of Neural
Representations [160.33479656108926]
In this work, we demonstrate that intermediate neural representations add more flexibility to neural networks.
We show that neural representation can achieve improved sample complexities compared with the raw input.
Our results characterize when neural representations are beneficial, and may provide a new perspective on why depth is important in deep learning.
arXiv Detail & Related papers (2020-06-24T02:44:54Z) - Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
We show that a standard neuron followed by our novel apical dendrite activation (ADA) can learn the XOR logical function with 100% accuracy.
We conduct experiments on six benchmark data sets from computer vision, signal processing and natural language processing.
arXiv Detail & Related papers (2020-02-02T21:09:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.