Rethinking Neural-based Matrix Inversion: Why can't, and Where can
- URL: http://arxiv.org/abs/2506.00642v1
- Date: Sat, 31 May 2025 17:11:15 GMT
- Title: Rethinking Neural-based Matrix Inversion: Why can't, and Where can
- Authors: Yuliang Ji, Jian Wu, Yuanzhe Xi
- Abstract summary: There currently exists no universal neural-based method for approximating matrix inversion. This paper presents a theoretical analysis demonstrating the fundamental limitations of neural networks in developing a general matrix inversion model. We explore the efficacy of neural networks in addressing the matrix inversion challenge.
- Score: 5.625854819595101
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks have achieved substantial success across various scientific computing tasks. A pivotal challenge within this domain is the rapid and parallel approximation of matrix inverses, critical for numerous applications. Despite significant progress, there currently exists no universal neural-based method for approximating matrix inversion. This paper presents a theoretical analysis demonstrating the fundamental limitations of neural networks in developing a general matrix inversion model. We expand the class of Lipschitz functions to encompass a wider array of neural network models, thereby refining our theoretical approach. Moreover, we delineate specific conditions under which neural networks can effectively approximate matrix inverses. Our theoretical results are supported by experiments on diverse matrix datasets that explore the efficacy of neural networks in addressing the matrix inversion challenge.
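The limitation described in the abstract can be illustrated with a small experiment: the map A -> A^{-1} is not Lipschitz near singular matrices, so a network trained on well-conditioned inputs degrades sharply as the condition number grows. The sketch below is only an illustration of that setting, not the paper's method; the matrix size, MLP architecture, and condition-number cutoff are assumptions chosen for demonstration.

```python
# A minimal sketch (assumptions: 3x3 matrices, an MLP on flattened inputs,
# a condition-number cutoff of 10). Not the paper's method or architecture.
import numpy as np
import torch
import torch.nn as nn

N = 3  # matrix size

def sample_matrices(batch=256, max_cond=10.0):
    """Rejection-sample random Gaussian matrices with bounded condition number."""
    mats = []
    while len(mats) < batch:
        A = np.random.randn(N, N)
        if np.linalg.cond(A) <= max_cond:
            mats.append(A)
    return np.stack(mats).astype(np.float32)

def to_batch(A):
    """Flatten a stack of matrices and pair it with the flattened inverses."""
    X = torch.from_numpy(A.reshape(len(A), -1))
    Y = torch.from_numpy(np.linalg.inv(A).reshape(len(A), -1))
    return X, Y

model = nn.Sequential(
    nn.Linear(N * N, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, N * N),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Train only on well-conditioned matrices, where A -> A^{-1} stays Lipschitz.
for step in range(2000):
    X, Y = to_batch(sample_matrices())
    loss = nn.functional.mse_loss(model(X), Y)
    opt.zero_grad()
    loss.backward()
    opt.step()

def mean_error(A):
    X, Y = to_batch(A)
    with torch.no_grad():
        return nn.functional.mse_loss(model(X), Y).item()

# Compare an in-distribution test set against a near-singular one built by
# shrinking the smallest singular value of each matrix.
A_easy = sample_matrices(batch=128)
U, S, Vt = np.linalg.svd(A_easy)
S[:, -1] *= 1e-3  # push matrices toward singularity
A_hard = ((U * S[:, None, :]) @ Vt).astype(np.float32)
print("MSE on well-conditioned matrices:", mean_error(A_easy))
print("MSE on near-singular matrices:   ", mean_error(A_hard))
```

The rejection sampling is the key design choice here: it keeps the training distribution inside a region where the inverse map has a bounded Lipschitz constant, which is the kind of restriction under which the paper argues neural approximation can work; on the near-singular test set the error should be expected to grow with the condition number.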
Related papers
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- Implicit Regularization via Spectral Neural Networks and Non-linear Matrix Sensing [2.171120568435925]
Spectral Neural Networks (abbrv. SNN) are particularly suitable for matrix learning problems.
We show that the SNN architecture is inherently much more amenable to theoretical analysis than vanilla neural nets.
We believe that the SNN architecture has the potential to be of wide applicability in a broad class of matrix learning scenarios.
arXiv Detail & Related papers (2024-02-27T15:28:01Z)
- Random matrix theory and the loss surfaces of neural networks [0.0]
We use random matrix theory to understand and describe the loss surfaces of large neural networks.
We derive powerful and novel results about the Hessians of neural network loss surfaces and their spectra.
This thesis provides important contributions to cement the place of random matrix theory in the theoretical study of modern neural networks.
arXiv Detail & Related papers (2023-06-03T13:16:17Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- On the Approximation and Complexity of Deep Neural Networks to Invariant Functions [0.0]
We study the approximation and complexity of deep neural networks to invariant functions.
We show that a broad range of invariant functions can be approximated by various types of neural network models.
We provide a feasible application that connects the parameter estimation and forecasting of high-resolution signals with our theoretical conclusions.
arXiv Detail & Related papers (2022-10-27T09:19:19Z)
- Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z)
- Universal characteristics of deep neural network loss surfaces from random matrix theory [0.5249805590164901]
We use universal properties of random matrices related to local statistics to derive practical implications for deep neural networks.
In particular we derive universal aspects of outliers in the spectra of deep neural networks and demonstrate the important role of random matrix local laws in popular pre-conditioning gradient descent algorithms.
arXiv Detail & Related papers (2022-05-17T19:42:23Z)
- A Sparse Coding Interpretation of Neural Networks and Theoretical Implications [0.0]
Deep convolutional neural networks have achieved unprecedented performance in various computer vision tasks.
We propose a sparse coding interpretation of neural networks that have ReLU activation.
We derive a complete convolutional neural network without normalization and pooling.
arXiv Detail & Related papers (2021-08-14T21:54:47Z)
- LocalDrop: A Hybrid Regularization for Deep Neural Networks [98.30782118441158]
We propose LocalDrop, a new approach to the regularization of neural networks based on the local Rademacher complexity.
A new regularization function for both fully-connected networks (FCNs) and convolutional neural networks (CNNs) has been developed based on the proposed upper bound of the local Rademacher complexity.
arXiv Detail & Related papers (2021-03-01T03:10:11Z)
- How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z)
- Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)