QuadEnhancer: Leveraging Quadratic Transformations to Enhance Deep Neural Networks
- URL: http://arxiv.org/abs/2510.03276v1
- Date: Sun, 28 Sep 2025 08:35:31 GMT
- Title: QuadEnhancer: Leveraging Quadratic Transformations to Enhance Deep Neural Networks
- Authors: Qian Chen, Linxin Yang, Akang Wang, Xiaodong Luo, Yin Zhang
- Abstract summary: This paper explores the introduction of quadratic transformations to further increase nonlinearity in neural networks. We propose a lightweight quadratic enhancer that uses low-rankness, weight sharing, and sparsification techniques. We conduct a set of proof-of-concept experiments for the proposed method across three tasks: image classification, text classification, and fine-tuning large language models.
- Score: 11.940590491663682
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The combination of linear transformations and non-linear activation functions forms the foundation of most modern deep neural networks, enabling them to approximate highly complex functions. This paper explores the introduction of quadratic transformations to further increase nonlinearity in neural networks, with the aim of enhancing the performance of existing architectures. To reduce parameter complexity and computational complexity, we propose a lightweight quadratic enhancer that uses low-rankness, weight sharing, and sparsification techniques. For a fixed architecture, the proposed approach introduces quadratic interactions between features at every layer, while only adding negligible amounts of additional model parameters and forward computations. We conduct a set of proof-of-concept experiments for the proposed method across three tasks: image classification, text classification, and fine-tuning large language models. In all tasks, the proposed approach demonstrates clear and substantial performance gains.
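The abstract does not spell out the enhancer's exact form, but the combination of low-rankness with per-layer quadratic feature interactions suggests a construction along the following lines. This is a minimal PyTorch sketch under stated assumptions: the class name, the rank-r factorization through U and V, and the projection P back to the output dimension are illustrative choices, not the paper's actual design.

```python
import torch
import torch.nn as nn

class QuadraticEnhancedLinear(nn.Module):
    """Linear layer augmented with a rank-r quadratic interaction term.

    Computes y = W x + b + P((U x) * (V x)), where U and V map the input
    into a small rank-r space and P projects the elementwise product back
    to the output dimension. With r << min(d_in, d_out), the extra
    parameters and compute are negligible next to the base linear layer.
    """

    def __init__(self, d_in: int, d_out: int, rank: int = 4):
        super().__init__()
        self.linear = nn.Linear(d_in, d_out)
        self.U = nn.Linear(d_in, rank, bias=False)
        self.V = nn.Linear(d_in, rank, bias=False)
        self.P = nn.Linear(rank, d_out, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Each of the r channels is a product of two learned linear
        # projections of x, so the term is quadratic in the features.
        quad = self.P(self.U(x) * self.V(x))
        return self.linear(x) + quad

# Usage: a drop-in replacement for nn.Linear in an existing architecture.
layer = QuadraticEnhancedLinear(128, 64, rank=4)
y = layer(torch.randn(32, 128))  # shape (32, 64)
```

With r fixed at a small constant, the extra cost is O(r(2 d_in + d_out)) parameters per layer, negligible next to the O(d_in d_out) of the base linear map; this is consistent with the abstract's claim of near-zero overhead.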
Related papers
- Weights initialization of neural networks for function approximation [0.9099663022952497]
Neural network-based function approximation plays a pivotal role in the advancement of scientific computing and machine learning. We propose a reusable framework based on basis function pretraining. In this approach, basis neural networks are first trained to approximate families of structural correspondences on a reference domain. Their learned parameters are then used to initialize networks for more complex target functions.
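A minimal sketch of what basis-function pretraining for initialization could look like, assuming a 1-D reference domain and a sine basis function; the helper name, architecture, and training loop below are hypothetical, not taken from the paper.

```python
import torch
import torch.nn as nn

def pretrain_basis_net(basis_fn, steps: int = 2000) -> nn.Sequential:
    """Fit a small MLP to one basis function on the reference domain [0, 1]."""
    net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(steps):
        x = torch.rand(256, 1)  # samples from the reference domain
        loss = ((net(x) - basis_fn(x)) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return net

# Pretrain on a simple basis function, then reuse the learned weights to
# warm-start a network that targets a more complex function.
basis_net = pretrain_basis_net(lambda x: torch.sin(torch.pi * x))
target_net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
target_net.load_state_dict(basis_net.state_dict())
```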
arXiv Detail & Related papers (2025-10-09T19:56:26Z) - MixFunn: A Neural Network for Differential Equations with Improved Generalization and Interpretability [0.0]
MixFunn is a novel neural network architecture designed to solve differential equations with enhanced precision, interpretability, and generalization capability. The architecture comprises two key components: the mixed-function neuron, which integrates multiple parameterized nonlinear functions, and the second-order neuron, which combines a linear transformation of its inputs with a quadratic term to capture cross-combinations of input variables.
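The second-order neuron admits a compact formulation: y = w·x + xᵀQx + b, where the off-diagonal entries of Q weight the cross-combinations x_i x_j. A minimal PyTorch sketch follows; the zero initialization of Q, which makes the neuron start out purely linear, is an assumption, not MixFunn's actual scheme.

```python
import torch
import torch.nn as nn

class SecondOrderNeuron(nn.Module):
    """One neuron with a linear part plus pairwise (cross) input products.

    y = w @ x + x^T Q x + b, where Q is a learned matrix whose entries
    weight the products x_i * x_j of input variables.
    """

    def __init__(self, d_in: int):
        super().__init__()
        self.w = nn.Parameter(torch.randn(d_in) / d_in**0.5)
        self.Q = nn.Parameter(torch.zeros(d_in, d_in))  # start purely linear
        self.b = nn.Parameter(torch.zeros(()))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        linear = x @ self.w                                   # (batch,)
        quadratic = torch.einsum("bi,ij,bj->b", x, self.Q, x)  # (batch,)
        return linear + quadratic + self.b
```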
arXiv Detail & Related papers (2025-03-28T15:31:15Z) - Generalized Factor Neural Network Model for High-dimensional Regression [50.554377879576066]
We tackle the challenges of modeling high-dimensional data sets with latent low-dimensional structures hidden within complex, non-linear, and noisy relationships. Our approach enables a seamless integration of concepts from non-parametric regression, factor models, and neural networks for high-dimensional regression.
arXiv Detail & Related papers (2025-02-16T23:13:55Z) - A Survey on Kolmogorov-Arnold Network [0.0]
This review explores the theoretical foundations, evolution, applications, and future potential of Kolmogorov-Arnold Networks (KANs).
KANs distinguish themselves from traditional neural networks by using learnable, spline-parameterized functions instead of fixed activation functions.
This paper highlights KAN's role in modern neural architectures and outlines future directions to improve its computational efficiency, interpretability, and scalability in data-intensive applications.
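To make the contrast with fixed activations concrete, here is a heavily simplified sketch of a learnable 1-D activation; actual KANs place B-spline-parameterized functions on every edge of the network, whereas this stand-in uses Gaussian bumps on a fixed grid purely for brevity.

```python
import torch
import torch.nn as nn

class LearnableActivation(nn.Module):
    """A 1-D learnable function phi(x) = sum_k c_k * basis_k(x).

    KANs use B-spline bases per edge; Gaussian bumps on a fixed grid
    stand in here as a simpler basis to keep the sketch short.
    """

    def __init__(self, num_basis: int = 8, lo: float = -2.0, hi: float = 2.0):
        super().__init__()
        self.register_buffer("centers", torch.linspace(lo, hi, num_basis))
        self.width = (hi - lo) / num_basis
        self.coeffs = nn.Parameter(torch.zeros(num_basis))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Evaluate each basis bump at x, then mix with learned coefficients.
        z = (x.unsqueeze(-1) - self.centers) / self.width
        return (torch.exp(-z**2) * self.coeffs).sum(-1)

act = LearnableActivation()
y = act(torch.randn(32, 16))  # applies the learned function elementwise
```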
arXiv Detail & Related papers (2024-11-09T05:54:17Z) - Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of its surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z) - Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - Advancing Neural Network Performance through Emergence-Promoting Initialization Scheme [0.0]
Emergence in machine learning refers to the spontaneous appearance of capabilities that arise from the scale and structure of training data. We introduce a novel yet straightforward neural network initialization scheme that aims at achieving greater potential for emergence. We demonstrate substantial improvements in both model accuracy and training speed, with and without batch normalization.
arXiv Detail & Related papers (2024-07-26T18:56:47Z) - GFN: A graph feedforward network for resolution-invariant reduced operator learning in multifidelity applications [0.0]
This work presents a novel resolution-invariant model order reduction strategy for multifidelity applications.
We base our architecture on a novel neural network layer developed in this work, the graph feedforward network.
We exploit the method's capability of training and testing on different mesh sizes in an autoencoder-based reduction strategy for parametrised partial differential equations.
arXiv Detail & Related papers (2024-06-05T18:31:37Z) - Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation [49.44309457870649]
Layer-wise Feedback Propagation (LFP) is a novel training principle for neural network-like predictors. LFP decomposes a reward to individual neurons based on their respective contributions. Our method then implements a greedy approach, reinforcing helpful parts of the network and weakening harmful ones.
arXiv Detail & Related papers (2023-08-23T10:48:28Z) - Reparameterization through Spatial Gradient Scaling [69.27487006953852]
Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training.
We present a novel spatial gradient scaling method to redistribute learning focus among weights in convolutional networks.
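As an illustration of the mechanism, a gradient hook can rescale a convolutional kernel's gradient per spatial position; note that the published method derives its scales from statistics of the feature maps, whereas this sketch leaves them to the caller as an assumption.

```python
import torch
import torch.nn as nn

def attach_spatial_gradient_scaling(conv: nn.Conv2d, scale: torch.Tensor):
    """Rescale the gradient of a conv kernel per spatial position.

    `scale` has shape (kH, kW) and multiplies the weight gradient at each
    kernel location, shifting learning focus across the receptive field.
    """
    def hook(grad: torch.Tensor) -> torch.Tensor:
        return grad * scale  # broadcasts over (out_ch, in_ch, kH, kW)
    conv.weight.register_hook(hook)

conv = nn.Conv2d(16, 32, kernel_size=3)
# Example: emphasize the kernel center over its borders.
scale = torch.tensor([[0.5, 0.8, 0.5],
                      [0.8, 1.5, 0.8],
                      [0.5, 0.8, 0.5]])
attach_spatial_gradient_scaling(conv, scale)
```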
arXiv Detail & Related papers (2023-03-05T17:57:33Z) - Iso-Points: Optimizing Neural Implicit Surfaces with Hybrid Representations [21.64457003420851]
We develop a hybrid neural surface representation that allows us to impose geometry-aware sampling and regularization.
We demonstrate that our method can be adopted to improve techniques for reconstructing neural implicit surfaces from multi-view images or point clouds.
arXiv Detail & Related papers (2020-12-11T15:51:04Z) - Deep Magnification-Flexible Upsampling over 3D Point Clouds [103.09504572409449]
We propose a novel end-to-end learning-based framework to generate dense point clouds.
We first formulate the problem explicitly, which boils down to determining the weights and high-order approximation errors.
Then, we design a lightweight neural network to adaptively learn unified and sorted weights as well as the high-order refinements.
arXiv Detail & Related papers (2020-11-25T14:00:18Z)