On Expressivity and Trainability of Quadratic Networks
- URL: http://arxiv.org/abs/2110.06081v3
- Date: Sat, 9 Sep 2023 02:15:53 GMT
- Title: On Expressivity and Trainability of Quadratic Networks
- Authors: Feng-Lei Fan, Mengzhou Li, Fei Wang, Rongjie Lai, Ge Wang
- Abstract summary: Quadratic artificial neurons can play an important role in deep learning models.
Using spline theory and a measure from algebraic geometry, we prove two theorems showing the superior expressivity of a quadratic network over either a conventional network or a conventional network with quadratic activation.
We also propose an effective training strategy, referred to as ReLinear, to stabilize the training process of a quadratic network.
- Score: 12.878230964137014
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Inspired by the diversity of biological neurons, quadratic artificial neurons
can play an important role in deep learning models. The type of quadratic
neurons of our interest replaces the inner-product operation in the
conventional neuron with a quadratic function. Despite promising results so far
achieved by networks of quadratic neurons, there are important issues not well
addressed. Theoretically, the superior expressivity of a quadratic network over
either a conventional network or a conventional network via quadratic
activation is not fully elucidated, which makes the use of quadratic networks
not well grounded. Practically, although a quadratic network can be trained via
generic backpropagation, it can be subject to a higher risk of collapse than
the conventional counterpart. To address these issues, we first apply the
spline theory and a measure from algebraic geometry to give two theorems that
demonstrate better model expressivity of a quadratic network than the
conventional counterpart with or without quadratic activation. Then, we propose
an effective training strategy referred to as ReLinear to stabilize the
training process of a quadratic network, thereby unleashing the full potential
in its associated machine learning tasks. Comprehensive experiments on popular
datasets are performed to support our findings and confirm the performance of
quadratic deep learning. We have shared our code at
https://github.com/FengleiFan/ReLinear.
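The abstract does not restate the neuron's functional form here. As an illustration only, the sketch below assumes the quadratic-neuron parameterization used in the authors' related work, y = (W_r x + b_r) * (W_g x + b_g) + W_b (x * x) + c with elementwise products, together with a ReLinear-style start: the quadratic parameters are initialized so each neuron initially behaves as a conventional linear neuron, and they are then trained with a smaller learning rate. The class and hyperparameter names are hypothetical; consult the linked repository for the authors' actual implementation.
```python
# Hypothetical sketch of a quadratic-neuron layer with a ReLinear-style start:
# the quadratic parameters begin at values that reduce the layer to a
# conventional linear layer, and are then trained with a smaller learning rate.
import torch
import torch.nn as nn

class QuadraticLayer(nn.Module):
    """y = (W_r x + b_r) * (W_g x + b_g) + W_b (x * x) + c, elementwise product.
    The constant c is folded into linear_b's bias."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear_r = nn.Linear(in_dim, out_dim)  # primary linear term
        self.linear_g = nn.Linear(in_dim, out_dim)  # second linear factor
        self.linear_b = nn.Linear(in_dim, out_dim)  # term on squared inputs
        # ReLinear-style initialization: W_g = 0, b_g = 1, W_b = 0, c = 0,
        # so that initially y = W_r x + b_r, i.e. a conventional neuron.
        nn.init.zeros_(self.linear_g.weight)
        nn.init.ones_(self.linear_g.bias)
        nn.init.zeros_(self.linear_b.weight)
        nn.init.zeros_(self.linear_b.bias)

    def forward(self, x):
        return self.linear_r(x) * self.linear_g(x) + self.linear_b(x * x)

# Two parameter groups: the quadratic terms get a smaller learning rate,
# the other half of the ReLinear recipe described in the abstract.
layer = QuadraticLayer(16, 8)
optimizer = torch.optim.SGD(
    [
        {"params": layer.linear_r.parameters(), "lr": 1e-2},
        {"params": list(layer.linear_g.parameters())
                   + list(layer.linear_b.parameters()), "lr": 1e-4},
    ],
    momentum=0.9,
)
```
Starting with W_g = 0 and b_g = 1 makes the product term equal the primary linear response at step zero, which keeps early training close to a conventional network and is consistent with the collapse-avoidance motivation stated in the abstract.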
Related papers
- LinSATNet: The Positive Linear Satisfiability Neural Networks [116.65291739666303]
This paper studies how to introduce the popular positive linear satisfiability constraints into neural networks.
We propose the first differentiable satisfiability layer based on an extension of the classic Sinkhorn algorithm for jointly encoding multiple sets of marginal distributions.
arXiv Detail & Related papers (2024-07-18T22:05:21Z) - QuadraNet V2: Efficient and Sustainable Training of High-Order Neural Networks with Quadratic Adaptation [25.003305443114296]
We introduce a novel framework, QuadraNet V2, which leverages quadratic neural networks to create efficient high-order learning models.
Our method initializes the primary term of the quadratic neuron using a standard neural network, while the quadratic term is employed to adaptively enhance the learning of data non-linearity or shifts.
By utilizing existing pre-trained weights, QuadraNet V2 reduces the required GPU hours for training by 90% to 98.4% compared to training from scratch, demonstrating both efficiency and effectiveness.
arXiv Detail & Related papers (2024-05-06T06:31:47Z) - Efficient Vectorized Backpropagation Algorithms for Training Feedforward Networks Composed of Quadratic Neurons [1.6574413179773761]
This paper presents a solution to the XOR problem with a single quadratic neuron (a minimal hand-set example is sketched after this list).
It shows that any dataset composed of $\mathcal{C}$ bounded clusters can be separated with only a single layer of $\mathcal{C}$ quadratic neurons.
arXiv Detail & Related papers (2023-10-04T15:39:57Z) - One Neuron Saved Is One Neuron Earned: On Parametric Efficiency of
Quadratic Networks [21.5187335186035]
We show that quadratic networks enjoy parametric efficiency, thereby confirming that the superior performance of quadratic networks is due to their intrinsic expressive capability.
From the perspective of the Barron space, we demonstrate that there exists a functional space whose functions can be approximated by quadratic networks with a dimension-free error.
arXiv Detail & Related papers (2023-03-11T05:32:18Z) - Attention-embedded Quadratic Network (Qttention) for Effective and
Interpretable Bearing Fault Diagnosis [0.31317409221921144]
Bearing fault diagnosis is of great importance for reducing the damage risk of rotating machines and further improving economic profits.
Recently, machine learning, represented by deep learning, has made great progress in bearing fault diagnosis.
However, applying deep learning to this task still faces two major problems.
arXiv Detail & Related papers (2022-06-01T10:51:01Z) - Excess Risk of Two-Layer ReLU Neural Networks in Teacher-Student
Settings and its Superiority to Kernel Methods [58.44819696433327]
We investigate the excess risk of two-layer ReLU neural networks in a teacher-student regression model.
We find that the student network provably outperforms kernel methods.
arXiv Detail & Related papers (2022-05-30T02:51:36Z) - Subquadratic Overparameterization for Shallow Neural Networks [60.721751363271146]
We provide an analytical framework that allows us to adopt standard neural training strategies.
We achieve the desiderata via the Polyak-Lojasiewicz condition, smoothness, and standard assumptions.
arXiv Detail & Related papers (2021-11-02T20:24:01Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can solve this separation problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
arXiv Detail & Related papers (2020-12-31T18:48:58Z) - Provably Training Neural Network Classifiers under Fairness Constraints [70.64045590577318]
We show that overparametrized neural networks can meet the fairness constraints.
A key ingredient in building a fair neural network classifier is establishing a no-regret analysis for neural networks.
arXiv Detail & Related papers (2020-12-30T18:46:50Z) - Avoiding Spurious Local Minima in Deep Quadratic Networks [0.0]
We characterize the landscape of the mean squared error for neural networks with quadratic activation functions.
We prove that deep neural networks with quadratic activations benefit from similar landscape properties.
arXiv Detail & Related papers (2019-12-31T22:31:11Z)
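As referenced in the quadratic-neuron entry above, a single quadratic neuron suffices for XOR. Below is a minimal hand-set sketch, assuming the neuron computes a full quadratic form f(x) = x^T A x + b^T x + c; the cited paper's exact neuron definition may differ.
```python
# A hand-set single quadratic neuron solving XOR on {0,1}^2, assuming the
# neuron computes f(x) = x^T A x + b^T x + c (a full quadratic form).
import numpy as np

A = np.array([[0.0, -1.0],
              [-1.0, 0.0]])   # cross terms contribute -2 * x1 * x2
b = np.array([1.0, 1.0])      # linear term contributes x1 + x2
c = 0.0

def quadratic_neuron(x):
    return x @ A @ x + b @ x + c

for point in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    x = np.array(point, dtype=float)
    y = quadratic_neuron(x)
    print(point, "->", int(y > 0.5))  # prints 0, 1, 1, 0: the XOR truth table
```
The cross term -2*x1*x2 combined with the linear term x1 + x2 reproduces the XOR truth table on {0,1}^2 exactly, which no single linear neuron can achieve.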