Related papers: Dynamic Spectral Backpropagation for Efficient Neural Network Training

URL: http://arxiv.org/abs/2505.23369v1
Date: Thu, 29 May 2025 11:47:50 GMT
Title: Dynamic Spectral Backpropagation for Efficient Neural Network Training
Authors: Mannmohan Muthuraman
Abstract summary: Dynamic Spectral Backpropagation (DSBP) enhances neural network training under resource constraints by projecting gradients onto principal eigenvectors. Five extensions are proposed to address challenges in robustness, few-shot learning, and hardware efficiency. DSBP outperforms Sharpness Aware Minimization (SAM), Low Rank Adaptation (LoRA), and Model Agnostic Meta Learning (MAML) on CIFAR 10, Fashion MNIST, MedMNIST, and Tiny ImageNet.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Dynamic Spectral Backpropagation (DSBP) enhances neural network training under resource constraints by projecting gradients onto principal eigenvectors, reducing complexity and promoting flat minima. Five extensions are proposed (dynamic spectral inference, spectral architecture optimization, spectral meta learning, spectral transfer regularization, and Lie algebra inspired dynamics) to address challenges in robustness, few-shot learning, and hardware efficiency. Supported by a third order stochastic differential equation (SDE) and a PAC Bayes limit, DSBP outperforms Sharpness Aware Minimization (SAM), Low Rank Adaptation (LoRA), and Model Agnostic Meta Learning (MAML) on CIFAR 10, Fashion MNIST, MedMNIST, and Tiny ImageNet, as demonstrated through extensive experiments and visualizations. Future work focuses on scalability, bias mitigation, and ethical considerations.
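
Illustrative sketch (not taken from the paper): the abstract describes projecting gradients onto principal eigenvectors but does not say which matrix supplies them. The NumPy example below assumes, purely for illustration, that the eigenvectors come from the covariance of per-sample gradients; the function name, the random data, and the choice k=3 are all hypothetical.

```python
import numpy as np

def project_onto_top_eigenvectors(grad, matrix, k):
    """Project `grad` onto the span of the k eigenvectors of the
    symmetric `matrix` that have the largest eigenvalues."""
    # np.linalg.eigh returns eigenvalues in ascending order, with
    # orthonormal eigenvectors stored as columns.
    _, eigvecs = np.linalg.eigh(matrix)
    top = eigvecs[:, -k:]              # (d, k) basis of the principal subspace
    return top @ (top.T @ grad)        # orthogonal projection of grad

# Hypothetical data: 64 per-sample gradients of an 8-parameter model.
rng = np.random.default_rng(0)
samples = rng.normal(size=(64, 8))
cov = np.cov(samples, rowvar=False)    # (8, 8) assumed source of eigenvectors
grad = samples.mean(axis=0)            # mini-batch gradient
projected = project_onto_top_eigenvectors(grad, cov, k=3)
```

The projected gradient lies in a 3-dimensional subspace, so an update along it changes only the directions spanned by the leading eigenvectors; whether DSBP uses this matrix or this rank is not stated in the abstract.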

Related papers

NeuroVoxel-LM: Language-Aligned 3D Perception via Dynamic Voxelization and Meta-Embedding [8.131547418489534]
We propose NeuroVoxel-LM, a novel framework that integrates Neural Radiance Fields (NeRF) with dynamic resolution voxelization and lightweight meta-embedding. Specifically, we introduce a Dynamic Resolution Multiscale Voxelization (DR-MSV) technique that adaptively adjusts voxel based on geometric and structural complexity. We also propose the Token-level Adaptive Pooling for Lightweight Meta-Embedding (TAP-LME) mechanism, which enhances semantic representation through attention-based weighting and residual fusion.
arXiv Detail & Related papers (2025-07-27T03:11:08Z)
Toward accurate RUL and SOH estimation using reinforced graph-based PINNs enhanced with dynamic weights [0.0]
We propose a framework that combines physics-based supervision with advanced-temporal learning. Q-learning agents dynamically assign weights to physics-informed loss terms, improving generalization across real-time industrial systems. In both RUL and SOH estimation tasks, the proposed method consistently outperforms state-of-the-art models.
arXiv Detail & Related papers (2025-07-13T19:49:12Z)
Learning to Dissipate Energy in Oscillatory State-Space Models [55.09730499143998]
State-space models (SSMs) are a class of networks for sequence learning. We show that D-LinOSS consistently outperforms previous LinOSS methods on long-range learning tasks.
arXiv Detail & Related papers (2025-05-17T23:15:17Z)
An Overview of Low-Rank Structures in the Training and Adaptation of Large Models [52.67110072923365]
Recent research has uncovered a widespread phenomenon in deep networks: the emergence of low-rank structures. These implicit low-dimensional patterns provide valuable insights for improving the efficiency of training and fine-tuning large-scale models. We present a comprehensive review of advances in exploiting low-rank structures for deep learning and shed light on their mathematical foundations.
arXiv Detail & Related papers (2025-03-25T17:26:09Z)
ResKoopNet: Learning Koopman Representations for Complex Dynamics with Spectral Residuals [1.8570740863168362]
Methods for approximating spectral components of high-dimensional dynamical systems often face theoretical limitations. We introduce ResKoopNet, which explicitly minimizes the spectral residual to compute Koopman eigenpairs. Experiments on a variety of physical and biological systems show that ResKoopNet achieves more accurate spectral approximations than existing methods.
arXiv Detail & Related papers (2025-01-01T02:19:42Z)
Approaching Deep Learning through the Spectral Dynamics of Weights [41.948042468042374]
spectral dynamics of weights -- the behavior of singular values and vectors during optimization -- to clarify and unify several phenomena in deep learning. We identify a consistent bias in optimization across various experiments, from small-scale "grokking" to large-scale tasks like image classification with ConvNets, image generation with UNets, speech recognition with LSTMs, and language modeling with Transformers.
arXiv Detail & Related papers (2024-08-21T17:48:01Z)
Unleashing Network Potentials for Semantic Scene Completion [50.95486458217653]
This paper proposes a novel SSC framework, the Adversarial Modality Modulation Network (AMMNet). AMMNet introduces two core modules: a cross-modal modulation enabling the interdependence of gradient flows between modalities, and a customized adversarial training scheme leveraging dynamic gradient competition. Extensive experimental results demonstrate that AMMNet outperforms state-of-the-art SSC methods by a large margin.
arXiv Detail & Related papers (2024-03-12T11:48:49Z)
Mitigating spectral bias for the multiscale operator learning [14.404769413313371]
We propose a hierarchical attention neural operator (HANO) inspired by the hierarchical matrix approach. HANO features a scale-adaptive interaction range and self-attentions over a hierarchy of levels, enabling nested feature computation with controllable linear cost. Our numerical experiments demonstrate that HANO outperforms state-of-the-art (SOTA) methods for representative multiscale problems.
arXiv Detail & Related papers (2022-10-19T21:09:29Z)
Momentum Diminishes the Effect of Spectral Bias in Physics-Informed Neural Networks [72.09574528342732]
Physics-informed neural network (PINN) algorithms have shown promising results in solving a wide range of problems involving partial differential equations (PDEs). They often fail to converge to desirable solutions when the target function contains high-frequency features, due to a phenomenon known as spectral bias. In the present work, we exploit neural tangent kernels (NTKs) to investigate the training dynamics of PINNs evolving under gradient descent with momentum (SGDM).
arXiv Detail & Related papers (2022-06-29T19:03:10Z)
Neural Operator with Regularity Structure for Modeling Dynamics Driven by SPDEs [70.51212431290611]
Stochastic partial differential equations (SPDEs) are significant tools for modeling dynamics in many areas including atmospheric sciences and physics. We propose the Neural Operator with Regularity Structure (NORS) which incorporates the feature vectors for modeling dynamics driven by SPDEs. We conduct experiments on a variety of SPDEs including the dynamic Phi^4_1 model and the 2D Navier-Stokes equation.
arXiv Detail & Related papers (2022-04-13T08:53:41Z)
Neural Dynamic Mode Decomposition for End-to-End Modeling of Nonlinear Dynamics [49.41640137945938]
We propose a neural dynamic mode decomposition for estimating a lift function based on neural networks. With our proposed method, the forecast error is backpropagated through the neural networks and the spectral decomposition. Our experiments demonstrate the effectiveness of our proposed method in terms of eigenvalue estimation and forecast performance.
arXiv Detail & Related papers (2020-12-11T08:34:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.