Rational ANOVA Networks
- URL: http://arxiv.org/abs/2602.04006v1
- Date: Tue, 03 Feb 2026 20:46:00 GMT
- Title: Rational ANOVA Networks
- Authors: Jusheng Zhang, Ningyuan Liu, Qinhan Lyu, Jing Yang, Keze Wang,
- Abstract summary: Deep neural networks treat nonlinearities as fixed primitives, limiting both interpretability and control over the induced function class.<n>We propose the Rational-ANOVA Network (RAN), a foundational architecture grounded in functional ANOVA decomposition and Pad-style rational approximation.
- Score: 12.085684184678348
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks typically treat nonlinearities as fixed primitives (e.g., ReLU), limiting both interpretability and the granularity of control over the induced function class. While recent additive models (like KANs) attempt to address this using splines, they often suffer from computational inefficiency and boundary instability. We propose the Rational-ANOVA Network (RAN), a foundational architecture grounded in functional ANOVA decomposition and Padé-style rational approximation. RAN models f(x) as a composition of main effects and sparse pairwise interactions, where each component is parameterized by a stable, learnable rational unit. Crucially, we enforce a strictly positive denominator, which avoids poles and numerical instability while capturing sharp transitions and near-singular behaviors more efficiently than polynomial bases. This ANOVA structure provides an explicit low-order interaction bias for data efficiency and interpretability, while the rational parameterization significantly improves extrapolation. Across controlled function benchmarks and vision classification tasks (e.g., CIFAR-10) under matched parameter and compute budgets, RAN matches or surpasses parameter-matched MLPs and learnable-activation baselines, with better stability and throughput. Code is available at https://github.com/jushengzhang/Rational-ANOVA-Networks.git.
Related papers
- Kolmogorov Arnold Networks and Multi-Layer Perceptrons: A Paradigm Shift in Neural Modelling [1.6998720690708842]
The research undertakes a comprehensive comparative analysis of Kolmogorov-Arnold Networks (KAN) and Multi-Layer Perceptrons (MLP)<n>KANs utilize spline-based activation functions and grid-based structures, providing a transformative approach compared to traditional neural network frameworks.<n>The proposed study highlights the transformative capabilities of KANs in progressing intelligent systems.
arXiv Detail & Related papers (2026-01-15T16:26:49Z) - Efficiency vs. Fidelity: A Comparative Analysis of Diffusion Probabilistic Models and Flow Matching on Low-Resource Hardware [0.0]
Denoising Diffusion Probabilistic Models (DDPMs) have established a new state-of-the-art in generative image synthesis.<n>This study presents a comparative analysis of DDPMs against the emerging Flow Matching paradigm.
arXiv Detail & Related papers (2025-11-24T18:19:42Z) - Deep Hierarchical Learning with Nested Subspace Networks [53.71337604556311]
We propose Nested Subspace Networks (NSNs) for large neural networks.<n>NSNs enable a single model to be dynamically and granularly adjusted across a continuous spectrum of compute budgets.<n>We show that NSNs can be surgically applied to pre-trained LLMs and unlock a smooth and predictable compute-performance frontier.
arXiv Detail & Related papers (2025-09-22T15:13:14Z) - Decentralized Nonconvex Composite Federated Learning with Gradient Tracking and Momentum [78.27945336558987]
Decentralized server (DFL) eliminates reliance on client-client architecture.<n>Non-smooth regularization is often incorporated into machine learning tasks.<n>We propose a novel novel DNCFL algorithm to solve these problems.
arXiv Detail & Related papers (2025-04-17T08:32:25Z) - Free-Knots Kolmogorov-Arnold Network: On the Analysis of Spline Knots and Advancing Stability [16.957071012748454]
Kolmogorov-Arnold Neural Networks (KANs) have gained significant attention in the machine learning community.<n>However, their implementation often suffers from poor training stability and heavy trainable parameter.<n>In this work, we analyze the behavior of KANs through the lens of spline knots and derive the lower and upper bound for the number of knots in B-spline-based KANs.
arXiv Detail & Related papers (2025-01-16T04:12:05Z) - CoRMF: Criticality-Ordered Recurrent Mean Field Ising Solver [4.364088891019632]
We propose an RNN-based efficient Ising model solver, the Criticality-ordered Recurrent Mean Field (CoRMF)
By leveraging the approximated tree structure of the underlying Ising graph, the newly-obtained criticality order enables the unification between variational mean-field and RNN.
CoRFM solves the Ising problems in a self-train fashion without data/evidence, and the inference tasks can be executed by directly sampling from RNN.
arXiv Detail & Related papers (2024-03-05T16:55:06Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linearahead as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z) - Robust Implicit Networks via Non-Euclidean Contractions [63.91638306025768]
Implicit neural networks show improved accuracy and significant reduction in memory consumption.
They can suffer from ill-posedness and convergence instability.
This paper provides a new framework to design well-posed and robust implicit neural networks.
arXiv Detail & Related papers (2021-06-06T18:05:02Z) - ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs now maintain performance with dramatic reduction in parameters and computations.
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.