Hardware-Friendly Input Expansion for Accelerating Function Approximation
- URL: http://arxiv.org/abs/2602.17952v1
- Date: Fri, 20 Feb 2026 03:07:05 GMT
- Title: Hardware-Friendly Input Expansion for Accelerating Function Approximation
- Authors: Hu Lou, Yin-Jun Gao, Dong-Xiao Zhang, Tai-Jiao Du, Jun-Jie Zhang, Jia-Rui Zhang
- Abstract summary: One-dimensional function approximation is a fundamental problem in scientific computing and engineering applications. This paper proposes a hardware-friendly approach for function approximation through input-space expansion. We evaluate the method on ten representative one-dimensional functions, including smooth, discontinuous, high-frequency, and non-differentiable functions.
- Score: 7.368108776065735
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: One-dimensional function approximation is a fundamental problem in scientific computing and engineering applications. While neural networks possess powerful universal approximation capabilities, their optimization process is often hindered by flat loss landscapes induced by parameter-space symmetries, leading to slow convergence and poor generalization, particularly for high-frequency components. Inspired by the principle of \emph{symmetry breaking} in physics, this paper proposes a hardware-friendly approach for function approximation through \emph{input-space expansion}. The core idea involves augmenting the original one-dimensional input (e.g., $x$) with constant values (e.g., $\pi$) to form a higher-dimensional vector (e.g., $[\pi, \pi, x, \pi, \pi]$), effectively breaking parameter symmetries without increasing the network's parameter count. We evaluate the method on ten representative one-dimensional functions, including smooth, discontinuous, high-frequency, and non-differentiable functions. Experimental results demonstrate that input-space expansion significantly accelerates training convergence (reducing LBFGS iterations by 12\% on average) and enhances approximation accuracy (reducing final MSE by 66.3\% for the optimal 5D expansion). Ablation studies further reveal the effects of different expansion dimensions and constant selections, with $\pi$ consistently outperforming other constants. Our work proposes a low-cost, efficient, and hardware-friendly technique for algorithm design.
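The expansion itself is straightforward to prototype. The following is a minimal PyTorch sketch of the idea described in the abstract, not the authors' code: the network width, depth, target function, and optimizer settings are illustrative assumptions; only the $[\pi, \pi, x, \pi, \pi]$ padding and the use of LBFGS follow the abstract.

```python
import math
import torch
import torch.nn as nn

PI = math.pi

def expand(x):
    """Map a batch of scalars x (shape [N, 1]) to [pi, pi, x, pi, pi] (shape [N, 5])."""
    pad = torch.full_like(x, PI)
    return torch.cat([pad, pad, x, pad, pad], dim=1)

class MLP(nn.Module):
    """Small fully connected network; the only change vs. a 1-D baseline is in_dim."""
    def __init__(self, in_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 64), nn.Tanh(),
            nn.Linear(64, 64), nn.Tanh(),
            nn.Linear(64, 1),
        )

    def forward(self, x):
        return self.net(x)

# Illustrative 1-D target: a high-frequency sine (an assumption, not from the paper).
x = torch.linspace(-1.0, 1.0, 512).unsqueeze(1)
y = torch.sin(8 * PI * x)

model = MLP(in_dim=5)  # use in_dim=1 and drop expand() for the unexpanded baseline
opt = torch.optim.LBFGS(model.parameters(), max_iter=200, line_search_fn="strong_wolfe")

def closure():
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(expand(x)), y)
    loss.backward()
    return loss

opt.step(closure)
print("final MSE:", closure().item())
```

Comparing this run against the in_dim=1 baseline (same widths, same optimizer budget) is one way to reproduce the kind of convergence and accuracy comparison the abstract reports, though the exact functions and settings used in the paper are not given here.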
Related papers
- Gradient Descent as a Perceptron Algorithm: Understanding Dynamics and Implicit Acceleration [67.12978375116599]
We show that the steps of gradient descent (GD) reduce to those of generalized perceptron algorithms. This helps explain the optimization dynamics and the implicit acceleration phenomenon observed in neural networks.
arXiv Detail & Related papers (2025-12-12T14:16:35Z) - Estimation of Toeplitz Covariance Matrices using Overparameterized Gradient Descent [1.7188280334580195]
We revisit Toeplitz covariance estimation through the lens of gradient descent (GD). We show that when $K = P$, GD may converge to suboptimal solutions. We propose an accelerated GD variant with separate learning rates for amplitudes and frequencies.
arXiv Detail & Related papers (2025-11-03T14:07:53Z) - Improved Stochastic Optimization of LogSumExp [2.8547553943343797]
We propose a novel approximation to LogSumExp that can be efficiently optimized using gradient methods. The approximation error is controlled by a tunable parameter and can be made arbitrarily small. Experiments in DRO and continuous optimal transport demonstrate the advantages of our approach.
arXiv Detail & Related papers (2025-09-29T15:03:55Z) - Preconditioned Additive Gaussian Processes with Fourier Acceleration [2.292881746604941]
We introduce a matrix-free method to achieve nearly linear complexity in the multiplication of kernel matrices and their derivatives. To address high-dimensional problems, we propose an additive kernel approach. Each sub-kernel captures lower-order feature interactions, allowing for the efficient application of the NFFT method.
arXiv Detail & Related papers (2025-04-01T07:14:06Z) - TensorGRaD: Tensor Gradient Robust Decomposition for Memory-Efficient Neural Operator Training [91.8932638236073]
We introduce TensorGRaD, a novel method that directly addresses the memory challenges associated with large-structured weights. We show that sparseGRaD reduces total memory usage by over $50\%$ while maintaining and sometimes even improving accuracy.
arXiv Detail & Related papers (2025-01-04T20:51:51Z) - A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization [90.87444114491116]
This paper studies minimax optimization problems defined over infinite-dimensional function classes of overparameterized two-layer neural networks.
We address (i) the convergence of the gradient descent-ascent algorithm and (ii) the representation learning of the neural networks.
Results show that the feature representation induced by the neural networks is allowed to deviate from the initial one by the magnitude of $O(\alpha^{-1})$, measured in terms of the Wasserstein distance.
arXiv Detail & Related papers (2024-04-18T16:46:08Z) - Geometry-induced Implicit Regularization in Deep ReLU Neural Networks [0.0]
Implicit regularization phenomena, which are still not well understood, occur during optimization.
We study the geometry of the output set as parameters vary.
We prove that the batch functional dimension is almost surely determined by the activation patterns in the hidden layers.
arXiv Detail & Related papers (2024-02-13T07:49:57Z) - On Convergence of Incremental Gradient for Non-Convex Smooth Functions [63.51187646914962]
In machine learning and network optimization, algorithms like shuffle SGD are popular due to minimizing the number of cache misses and good cache locality.
This paper delves into the convergence properties of SGD algorithms with arbitrary data ordering.
arXiv Detail & Related papers (2023-05-30T17:47:27Z) - Reducing the Variance of Gaussian Process Hyperparameter Optimization with Preconditioning [54.01682318834995]
Preconditioning is a highly effective step for any iterative method involving matrix-vector multiplication.
We prove that preconditioning has an additional benefit that has been previously unexplored.
It can simultaneously reduce variance at essentially negligible cost.
arXiv Detail & Related papers (2021-07-01T06:43:11Z) - Finding Global Minima via Kernel Approximations [90.42048080064849]
We consider the global minimization of smooth functions based solely on function evaluations.
In this paper, we consider an approach that jointly models the function to approximate and finds a global minimum.
arXiv Detail & Related papers (2020-12-22T12:59:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.