Emergence of the SVD as an interpretable factorization in deep learning for inverse problems
- URL: http://arxiv.org/abs/2301.07820v2
- Date: Wed, 9 Aug 2023 00:44:38 GMT
- Title: Emergence of the SVD as an interpretable factorization in deep learning for inverse problems
- Authors: Shashank Sule, Richard G. Spencer and Wojciech Czaja
- Abstract summary: We demonstrate the emergence of the singular value decomposition (SVD) of the weight matrix as a tool for interpretation of neural networks.
We show that descrambling transformations can be expressed in terms of the SVD of the NN weights and the input autocorrelation matrix.
- Score: 1.5567671045891203
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Within the framework of deep learning we demonstrate the emergence of the
singular value decomposition (SVD) of the weight matrix as a tool for
interpretation of neural networks (NN) when combined with the descrambling
transformation--a recently-developed technique for addressing interpretability
in noisy parameter estimation neural networks \cite{amey2021neural}. By
considering the averaging effect of the data passed to the descrambling
minimization problem, we show that descrambling transformations--in the large
data limit--can be expressed in terms of the SVD of the NN weights and the
input autocorrelation matrix. Using this fact, we show that within the class of
noisy parameter estimation problems the SVD may be the structure through which
trained networks encode a signal model. We substantiate our theoretical
findings with empirical evidence from both linear and non-linear signal models.
Our results also illuminate the connections between a mathematical theory of
semantic development \cite{saxe2019mathematical} and neural network
interpretability.
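To make the abstract's central objects concrete, below is a minimal numpy sketch (not the authors' code; the weights and data are random stand-ins) that computes the two quantities the large-data-limit result relates: the SVD of a layer's weight matrix and the autocorrelation matrix of the inputs fed to that layer. The alignment check at the end is purely illustrative of the kind of structure the paper studies, not a statement of its theorem.

```python
import numpy as np

# Hedged sketch: random stand-ins for a trained layer's weight matrix W
# and its input batch X. In practice W would come from a trained network
# and X from the training data.
rng = np.random.default_rng(0)
n_samples, n_in, n_out = 4096, 64, 32
X = rng.standard_normal((n_samples, n_in))
W = rng.standard_normal((n_out, n_in)) / np.sqrt(n_in)

# SVD of the weight matrix: W = U @ diag(S) @ Vt.
U, S, Vt = np.linalg.svd(W, full_matrices=False)

# Input autocorrelation matrix, estimated from the batch.
C = X.T @ X / n_samples  # shape (n_in, n_in)

# The paper expresses the descrambling transformation through these two
# objects in the large-data limit. As a rough, illustrative diagnostic,
# measure how the right singular vectors of W overlap with the
# eigenvectors of C.
eigvals, eigvecs = np.linalg.eigh(C)
alignment = np.abs(Vt @ eigvecs)  # overlap magnitudes, shape (n_out, n_in)
print("top singular values of W:", np.round(S[:5], 3))
print("strongest alignments:", np.round(alignment.max(axis=1)[:5], 3))
```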
Related papers
- Nonlinear classification of neural manifolds with contextual information [6.292933471495322]
Manifold capacity has emerged as a promising framework linking population geometry to the separability of neural manifolds.
We propose a theoretical framework that overcomes this limitation by leveraging contextual input information.
Our framework's increased expressivity captures representation untanglement in deep networks at early stages of the layer hierarchy, previously inaccessible to analysis.
arXiv Detail & Related papers (2024-05-10T23:37:31Z) - Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z) - Learning Invariant Weights in Neural Networks [16.127299898156203]
Many commonly used models in machine learning are constrained to respect certain symmetries in the data.
We propose a weight-space equivalent to this approach, by minimizing a lower bound on the marginal likelihood to learn invariances in neural networks.
arXiv Detail & Related papers (2022-02-25T00:17:09Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - Robust lEarned Shrinkage-Thresholding (REST): Robust unrolling for sparse recovery [87.28082715343896]
We consider deep neural networks for solving inverse problems that are robust to forward model mis-specifications.
We design a new robust deep neural network architecture by applying algorithm unfolding techniques to a robust version of the underlying recovery problem.
The proposed REST network is shown to outperform state-of-the-art model-based and data-driven algorithms in both compressive sensing and radar imaging problems.
arXiv Detail & Related papers (2021-10-20T06:15:45Z) - Bayesian neural networks and dimensionality reduction [4.039245878626346]
A class of model-based approaches to dimensionality reduction includes latent variables in an unknown non-linear regression function.
VAEs are artificial neural networks (ANNs) that employ approximations to make computation tractable.
We deploy Markov chain Monte Carlo sampling algorithms for Bayesian inference in ANN models with latent variables.
arXiv Detail & Related papers (2020-07-02T17:55:47Z) - Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-06-20T18:09:59Z) - Estimating Model Uncertainty of Neural Networks in Sparse Information Form [39.553268191681376]
We present a sparse representation of model uncertainty for Deep Neural Networks (DNNs).
The key insight of our work is that the information matrix tends to be sparse in its spectrum.
We show that the information form can be scalably applied to represent model uncertainty in DNNs.
arXiv Detail & Related papers (2020-06-20T18:09:59Z) - An Ode to an ODE [78.97367880223254]
We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where the time-dependent parameters of the main flow evolve according to a matrix flow on the orthogonal group O(d).
This nested system of two flows stabilizes training and provably solves the vanishing/exploding gradient problem.
arXiv Detail & Related papers (2020-06-19T22:05:19Z) - Network Diffusions via Neural Mean-Field Dynamics [52.091487866968286]
We propose a novel learning framework for inference and estimation problems of diffusion on networks.
Our framework is derived from the Mori-Zwanzig formalism to obtain an exact evolution of the node infection probabilities.
Our approach is versatile and robust to variations of the underlying diffusion network models.
arXiv Detail & Related papers (2020-06-16T18:45:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.